Abstract


We extend a fragment of the programming language ML by incorporating a more general 
form of record pattern matching and providing for user-declared subtypes. Together, these 
two enhancements may be used to support a restricted object-oriented programming style. 
In keeping with the framework of ML, we present typing rules for the language, and develop 
a type inference algorithm. We prove that the algorithm is sound with respect to the typing 
rules, and that it infers a most general typing for every typable expression. 
Chapter 1


Introduction 


1.1 Type Inference and Object-Oriented Programming 


During the past decade, a programming style known as “object-oriented” programming has 
become the basis for several popular programming languages, including Smalltalk [GR83]
and C++ [Str86]. In this programming methodology, the basic building blocks are objects,
which are grouped together in classes of similar objects. There is a hierarchy of classes, 
where more specialized objects of a class constitute subclasses; importantly, operations 
defined on a class can be inherited, or automatically used, by objects of any subclass. 
Furthermore, the representation of objects and the associated class operations can be hidden 
from the outside world. These features of inheritance and information hiding are often 
regarded as central to object-oriented programming. 

Another class of programming languages consists of strongly typed languages, in which 
every expression has a type that can be checked at compile-time. These languages, such as 
Pascal and CLU, are desirable because, among other things, they guarantee that a certain 
class of errors known as type errors will not arise at run-time. Some drawbacks of traditional 
strongly typed languages are that they require the programmer to declare the types of all 
variables, and require the same function to be redefined over different types. However, in 
languages that support type inference, the types of all variables and expressions can be 
inferred by the compiler from the surrounding context, and thus can be omitted by the 
programmer. Moreover, the inferred types are “most general” types, from which all valid 
types can be easily derived. Languages with type inference thus provide greater flexibility 
and expressive power than traditional strongly typed languages, while maintaining the same 
guarantees on run-time behavior. ML [Mil85], a functional language that is based on the 
simply typed lambda calculus, is the paradigmatic language with type inference. 

Combining strong typing and object-oriented programming clearly has many advan-
tages, and in the past few years, there has been much research on extending strongly typed
languages to support an object-oriented programming style [Car84, Mit84, Wan87, Sta88, 
JM88, Rem89] by incorporating some form of subtyping to model subclass relations. How- 
ever, extending languages with type inference in this manner can pose two serious problems: 
there may be no single “most general” typing, or worse yet, the type inference problem
may become undecidable. 


1.2 Background and Results 


The viewpoint developed by [Car84] and adopted by many others [Wan87, Sta88, JM88, 
Rem89] is to think of an object as a record consisting of a finite set of labeled, typed fields. 
Inheritance is modeled by allowing a function defined on a record to be used automatically 
on a record with more fields. More specifically, the “subclass” relation is that a record type 
y is a subtype of record type z if y has at least all of the fields of z, and perhaps more. 
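
The record “subclass” relation described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: record types are modeled as Python dictionaries from labels to type names, the function name is_record_subtype is hypothetical, and field types are compared for plain equality, which is a simplification of the systems cited.

```python
def is_record_subtype(y, z):
    """Return True if record type y has at least all the fields of z.

    Record types are modeled as dicts from field labels to type names
    (an assumption made for this illustration only).
    """
    return all(label in y and y[label] == ty for label, ty in z.items())


# Hypothetical record types used only for illustration.
point = {"x": "int", "y": "int"}
colored_point = {"x": "int", "y": "int", "color": "string"}

print(is_record_subtype(colored_point, point))  # True: extra fields are allowed
print(is_record_subtype(point, colored_point))  # False: the color field is missing
```

A function written for point records can thus be used on colored_point records, which is the sense in which inheritance is modeled.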

Although viewing objects as records captures some of the “object-oriented” behavior 
that we want, it does not permit information hiding. For this reason, it seems that a more 
useful perspective is to think of objects as some form of abstract data types. For exam- 
ple, we can think of objects as “ML-style” abstract data types, which have an associated 
name, a hidden representation, and associated operations whose implementation is hidden. 
Importantly, the types of these operations do not reveal the underlying representation of 
the abstract type; they only refer to the abstract type by name. In order to support both 
inheritance and information hiding in this framework, we want objects of any subtype to 
be able to use these operations automatically, without revealing the representation of the 
subtype. Thus, in this scenario, it seems that we need some mechanism to declare explicitly 
the subtyping relations between the names of abstract types in order to get the sort of 
subtyping behavior that we desire. Some relevant work, which we believe can be extended 
to support subtyping between abstract type names, is [Mit84], which develops a type sys- 
tem and algorithm for inferring most general types for pure lambda terms in the context of 
subtype declarations. In [Mit84], the system of subtyping relations between simple types 
is referred to as “atomic subtyping,” and can be thought of as a more general form of 
“bounded quantification” [CW85].

There has been quite a bit of work done recently on developing systems with subtyping 
of records. The seminal work on this topic, [Car84], provides a set of typing rules for a type 
system with record subtyping and presents a type-checking algorithm for that language. 
However, the programmer is required to declare the types of all variables, and type inference 
is not provided. 

[Wan87] has a similar sort of “subtype” relation among record types as [Car84], but
a somewhat different set of typing rules, and he does provide an algorithm for type inference. 
The main technical innovations are the introduction of “row variables”, which allow more 
flexibility in the typing of records, and an extension of unification [Rob65] that supports 
row variables. However, due to some technical problems, single most general typings do not 
exist for some typable terms in the system of [Wan87]; in fact, the typing algorithm has to 
infer a (finite) set of most general typings. 

In [JM88], we extend the atomic subtyping system of [Mit84] to support base types and 
show that it can be merged in a natural manner with a system of rules for deriving record 
subtyping that is similar to that of [Wan87]. We define our language ML*, which is an
extension of a kernel of the programming language ML, and a type inference algorithm for
ML*, which uses an extension of unification similar to that of [Wan87]. We show that our
algorithm infers only types derivable by our type system, and generates single most general 
typings. Importantly, by imposing a minor restriction on records, we correct the technical 
difficulties of [Wan87]; we believe that our paper was the first published work to do so. This 
thesis revises some of the material presented in [JM88] and develops it in more detail.

The ML* language is based on a functional subset of ML that includes built-in constants, 
records, and function abstraction using pattern matching constructs to decompose records 
into their constituents. For the sake of simplicity, we do not include the polymorphic
“let” construct of ML, since it can be regarded as syntactic sugar. Although there are
some algorithmic issues of dealing efficiently with the “let” construct, we do not address 
them here. We do not include variant records (tagged unions), although we believe that 
the subtyping of variants involves issues closely related to the subtyping of records [Car84, 
Wan87, Sta88, Rem89]. We would expect that incorporating variants into our system would 
not cause any serious difficulties, but we do anticipate that there would be some technical 
overhead involved. 

We extend this kernel of ML in two ways. We develop a form of “extended pattern match-
ing” that allows us to type records in a way that allows a function on records to be applied
to every record containing some minimum set of required fields. More specifically, we in- 
troduce expression variables denoting elements of Wand’s “rows.” These may be bound 
by pattern matching to sequences of labeled values (parts of records). Using this extended 
pattern matching, we may define functions that operate “anonymously” on parts of records 
without knowing the names of the fields involved. In keeping with the spirit of ML, this 
form of extended pattern matching eliminates the need for conditional statements to decom- 
pose arguments passed to functions, and we have developed some rather elaborate technical 
machinery in our system in order to support this expressive power. 

Our second extension to ML involves subtyping relations between atomic types. For 
built-in atomic types such as int and bool, subtyping relationships such as int ⊆ real must
be specified as part of the language design. As mentioned earlier, our treatment of subtyping 
extends the system for atomic subtyping developed in [Mit84]. It is our hope that this form 
of subtyping combined with record subtyping will provide a framework for subtyping among 
abstract types. 
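
As a rough illustration of subtyping declarations between atomic types, the sketch below computes which atomic types are related once a few relationships have been declared. It assumes that the relation is closed under reflexivity and transitivity; the names subtype_closure and is_subtype and the atomic type nat are hypothetical and are not taken from the system presented here.

```python
def subtype_closure(declared):
    """Reflexive-transitive closure of declared pairs (s, t), read 's is a subtype of t'."""
    names = {name for pair in declared for name in pair}
    closure = set(declared) | {(name, name) for name in names}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_subtype(s, t, closure):
    """Check whether s is a subtype of t under the given closure."""
    return (s, t) in closure


# int <= real is the example from the text; nat <= int is a hypothetical addition.
closure = subtype_closure({("int", "real"), ("nat", "int")})
print(is_subtype("nat", "real", closure))  # True, by transitivity
print(is_subtype("real", "int", closure))  # False
```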

Some examples will illustrate the flavor of ML*. Following Standard ML, function ex-
pressions (lambda abstractions) are written fn P => M, where P is a pattern, often resem-
bling a record expression, and M is any expression. For example, a function f incrementing
the a field of a record may be written as 


fn {a = x; v} => {a = x + 1; v}     (f)


Essentially, the pattern {a = x; v} matches any record with an a field, binding x to the
value of a, and binding v to any finite mapping from labels to values extending a = x. One
type of this function in ML* is 


{a: int; NULL} → {a: int; NULL}


where NULL indicates that the record contains exactly the fields specified, since f clearly 
maps records with exactly an a: int field to records with exactly an a: int. Another type of 
f is written 

{a: int; V} → {a: int; V}
where V is a “row” variable denoting a sequence of labeled types. The row variable V in
this type expression is implicitly universally quantified, so this typing “says” that f has
type {a: int; V} → {a: int; V} for any row V. In particular, f has type


{a: int, b: bool, c: string} → {a: int, b: bool, c: string}


since b: bool, c: string is a possible value for V. Thus, if we apply f to the record {a = 1, b =
true, c = “extra”}, then the “extension” variable v is bound to b = true, c = “extra”, and
the result of the function application is


f({a = 1, b = true, c = “extra”}) = {a = 2, b = true, c = “extra”}
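
The behavior of f on this argument can be mimicked in a few lines of Python. This is only a sketch: records are modeled as dictionaries, and the requirement that an a field be present is checked when the function runs, whereas in ML* it is enforced by the type system before execution.

```python
def f(record):
    # Pattern {a = x; v}: the record must have an 'a' field ...
    if "a" not in record:
        raise TypeError("pattern {a = x; v} requires an 'a' field")
    x = record["a"]
    # ... and v is bound to the remaining fields, whatever they are.
    v = {label: value for label, value in record.items() if label != "a"}
    # Result {a = x + 1; v}: the incremented field plus the untouched rest.
    return {"a": x + 1, **v}


print(f({"a": 1, "b": True, "c": "extra"}))
# {'a': 2, 'b': True, 'c': 'extra'}
```

The fields b and c pass through without being named in the function body, which is the “anonymous” treatment of record parts described earlier.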


As mentioned earlier, the use of row variables to describe the inherent polymorphism 
of record operations is due to Wand [Wan87]. One technical difficulty is that V in a type
{a: int; V} → ... should not denote a row giving a type to a, since then the type of a might
be multiply-defined. This leads to certain subtle considerations in our typing algorithm, 
and also the point of departure from Wand’s previous work. While Wand allowed type 
expressions with multiple occurrences of field names (using order of occurrence to determine 
precedence), this in fact leads to a set of most general typings. In a sense, the difficulty 
with Wand’s algorithm begins with his expression language. Wand’s with expression has 
two informal readings: the expression x with a := 3 has the effect of either modifying the a
field of x if there is one, or adding one if there is not. From a type checking point of view, it
is not clear whether we should assume x has an a field or not, and so there are two typings
to consider. 

We avoid these technical problems associated with Wand’s language by restricting well- 
formed record types not to contain duplicate labels, and by using extended pattern matching 
instead of with. Together, these two restrictions avoid the ambiguity of Wand’s expressions 
by compelling the programmer to choose which interpretation he wants his expression to 
have. For example, the expression λx. x with a := 3 in Wand’s system translates to two
expressions in ML*, namely fn {a = y; u} => {a = 3; u} and fn {u} => {a = 3; u}. The first
expression constrains the argument record to have an a field, while the second expression 
constrains the argument record not to have an a field. This contrasts with the expression 
in Wand’s system, which allows a record either with or without an a field to be passed as 
an argument. Aside from translating ambiguous expressions in Wand’s language to sets of 
expressions, ML* retains all of the expressive power of Wand’s system, while generating
single most general typings. We note that this expressive power is due to pattern matching; 
making a similar restriction on the records in Wand’s system would greatly restrict the 
expressive power of that language. For example, the expression λx. x with a := x.a + 1
in Wand’s system would not be typable with this restriction, since the result would have 
two a fields. However, by using extended pattern matching, we can write this expression in 
ML* as fn {a = y; u} => {a = y + 1; u}.

Furthermore, there are some expressions that we can write in ML* that are not express-
ible in Wand’s language. One such sort of expression that we can write in ML* is


fn {a = y; u} => {u}
which allows some fields of a record to be “forgotten”. Another such expression in ML* is
fn {a = y; EMPTY} => y


where the EMPTY indicates that the record has exactly an a field. Although Wand’s lan-
guage can conceivably be extended to have this expressive power, there are some problems.
First, it seems difficult, and perhaps even impossible, to define and type a “forget” op- 
eration that forgets all occurrences of a given field, including those fields that have been 
overwritten. Second, an operation “exactly” that restricts a record to have exactly a certain 
set of fields has precisely the same sort of ambiguity with respect to overwritten fields as 
that of the with construct, and thus also leads to a set of most general typings. 
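
The four ML* functions discussed in the preceding paragraphs (the two translations of Wand’s with expression, the “forget” function, and the “exactly” function) can be imitated in Python as follows. This is a sketch under assumptions: records are dictionaries, all function names are hypothetical, and the constraints on the argument are checked at run time rather than by type inference as in ML*.

```python
def split_a(record):
    """Match the pattern {a = y; u}: return (y, u), or raise if there is no 'a' field."""
    if "a" not in record:
        raise TypeError("argument must have an 'a' field")
    rest = {label: value for label, value in record.items() if label != "a"}
    return record["a"], rest


def update_a(record):
    """fn {a = y; u} => {a = 3; u}: overwrite an existing 'a' field."""
    _, u = split_a(record)
    return {"a": 3, **u}


def add_a(record):
    """fn {u} => {a = 3; u}: add an 'a' field to a record that lacks one."""
    if "a" in record:
        raise TypeError("argument must not have an 'a' field")
    return {"a": 3, **record}


def forget_a(record):
    """fn {a = y; u} => {u}: drop the 'a' field and keep the rest."""
    _, u = split_a(record)
    return u


def exactly_a(record):
    """fn {a = y; EMPTY} => y: accept only records with exactly an 'a' field."""
    y, u = split_a(record)
    if u:
        raise TypeError("argument must have exactly an 'a' field")
    return y


print(update_a({"a": 1, "b": True}))  # {'a': 3, 'b': True}
print(add_a({"b": True}))             # {'a': 3, 'b': True}
print(forget_a({"a": 1, "b": True}))  # {'b': True}
print(exactly_a({"a": 7}))            # 7
```

Keeping update_a and add_a as separate functions mirrors the point made above: the programmer chooses one reading of with, so each function has a single constraint on its argument.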


1.3 Related Research and Future Directions


Between the time that [JM88] was published and the time that this thesis was completed, 
Remy [Rem89] has resolved some of the technical difficulties of [Wan87] without imposing
any restrictions on duplicate fields in records, and his system also incorporates variants
and recursive types. The main technical innovation is a new perspective on records in the 
context of a finite set of labels. All records are assumed to contain fields corresponding to 
all labels in the (finite) set; however, only some of the fields need to contain values. The 
fields that do not contain values are considered to be “uninitialized”, but must be written 
out explicitly. However, in a more recent paper [Wand89], Wand extends Remy’s system 
to an infinite set of labels using row variables, so that only the fields that play a role in a 
certain expression need be written explicitly. 

Remy’s type system is richer than both the system of [Wan87] and our system without 
atomic subtyping. In fact, in Remy’s system, all of the typable expressions in the language 
of [Wan87] have single most general typings. Moreover, if we omit atomic subtyping from 
our system, then all expressions that are typable in our system can be translated into Remy’s 
language and typed in his system. In particular, expressions corresponding to “forgetting” 
fields of a record can be translated directly, while expressions denoting that a record must 
have exactly a certain set of (initialized) fields can be translated into a straightforward 
extension of his language. Furthermore, Remy’s system can type expressions that are not 
typable in either the system of [Wan87] or our complete system. For example, his approach
would assign to the expression*


if c then {a = 3, b = true} else {a = 5}


the record type such that a may be considered to be an integer field, and all other fields
are “uninitialized”, since the records {a = 3, b = true} and {a = 5} are unifiable in his
system. However, under both our approach and the approach of [Wan87], these records are 
not unifiable, and thus, this expression would not be typable. 

The system of [Wan87] is a special case of Remy’s system, but the relationship is more 
complicated for ML*. Although any ML* expression that is typable without atomic sub-
typing can be translated into a typable expression in Remy’s language, our system without
atomic subtyping is not a special case of Remy’s system. The typing that Remy’s system 
generates for a translated ML* expression does not translate back into the typing that
our system yields; specifically, Remy’s typings cannot capture the constraint that records 
cannot have duplicate labels anywhere in the typing derivation of an expression. (As will 
become apparent in following chapters of this thesis, our typings capture this constraint by 
explicitly stating the set of types that appear in the derivation of an expression, but do not 
appear in the final typing statement.) 

In fact, our system without atomic subtyping distinguishes a larger class of type errors 
than that of Remy’s system. For example, our type inference algorithm generates a type 
error if the function fn {u} => {a = 3;u} is applied to a record with an a field, thus 
avoiding the situation in which a programmer unwittingly overwrites a field. Since at this 
moment it is unclear what sorts of languages and type systems are desirable for object- 
oriented programming, it may turn out later on that the larger class of type errors in ML* 
is advantageous to programmers. The above example shows that there is some question as 
to whether the ambiguous expressions in Wand’s system should be typable. 

On the other hand, language designers may consider the system of [Rem89] more desir- 
able than our system, since Remy’s system types more expressions than ours, his system 


* Although an “if” statement is not directly expressible in some of the languages discussed here, it is a
simple matter to type such a statement. Namely, the if-clause must be a boolean, and both arms of the 
statement must have a “least” type in common. 


incorporates variants and recursive types, and his type inference algorithm uses the usual 
unification algorithm rather than the extension to unification developed in [Wan87]. Thus, 
anyone wishing to implement a language with type inference that supports automatic sub- 
typing between records should probably not consider the type system and type inference 
algorithm presented in this thesis, but instead, should adopt the system of [Rem89] ex- 
tended to an infinite set of labels as in [Wand89]. It is important to point out, though, that 
Remy has not incorporated atomic subtyping into his system. Although it is quite possible 
that atomic subtyping can be merged naturally into Remy’s system, a serious student of the 
subject may wish to read our work in order to get some insights into how it can be done. 

In addition to the three papers discussed in detail above, a host of other papers on this 
topic have been published in the past few years [Sta88, OB88, Wand89, FM88, CCHMO89],
presenting different languages, type systems, and type inference algorithms that support 
various “object-oriented” features. We give a brief comparison of these systems here. 

If we only consider the core language consisting of variables, records, lambda abstraction, 
and function application in all the systems, then the system of [Car84] is incomparable with 
the system of [Wan87] and our system. For example, the [Car84] approach would assign 
the expression 

if x then {a := 3, b := true} else {a := 5, b := λx. x}


the type {a: int}, since this is the least upper bound of the types of the records {a := 3, b := true}
and {a := 5, b := λx. x}. However, under both our approach and the approach of [Wan87],
these records are not unifiable, and thus, this expression is not typable. On the other hand, 
in Cardelli’s system, one cannot translate into a typable expression an expression like 


((λx. x with a := x.a + 1){a := 3, l := y}).l


where l is an arbitrary label. In Cardelli’s system, the type of a lambda-bound variable
must be declared, and since there is no mechanism to express the “rest of a record”, the 
type of a lambda-bound record must have a certain fixed set of fields. Thus, the argument
record must be coerced to have that same fixed set of fields, and, in general, applying a 
function to a record with more fields results in the loss of the “extra” fields. 
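
The least upper bound mentioned in this paragraph can be approximated by keeping only the fields on which both record types agree. The sketch below is a deliberate simplification and not the algorithm of [Car84]: record types are dictionaries from labels to type names, field types are compared for equality only, and the function name record_lub and the type name given to the b field in the second arm are hypothetical.

```python
def record_lub(s, t):
    """Keep only the fields that occur in both record types with identical types."""
    return {label: ty for label, ty in s.items() if t.get(label) == ty}


# Types of the two arms of the conditional; "t -> t" is a hypothetical
# name standing in for the type of the function stored in the b field.
then_type = {"a": "int", "b": "bool"}
else_type = {"a": "int", "b": "t -> t"}

print(record_lub(then_type, else_type))  # {'a': 'int'}
```

The b field disappears from the result, which illustrates the loss of “extra” fields described above.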

Although the basic system of [Rem89] discussed earlier in this section is incomparable 
to that of [Car84] for the same reasons discussed above, it seems that an extension to 
Remy’s basic system can type all the expressions typable in the system of [Car84], as well 
as all the expressions typable in the system of [Wan87]. However, in this system, one 
cannot express the notion of restricting a record to have exactly a certain set of fields, 
and thus, this extended system cannot type all the expressions typable in ML* (without 
atomic subtyping). At any rate, this extension to Remy’s system is merely sketched, and 
not worked out in detail, in [Rem89]. 

A type inference algorithm for a language based on [Car84] is presented in detail in 
[Sta88], which generates a set of subtype relations between records that are satisfied when- 
ever a typing for an expression is derivable. He does not, however, present an algorithm 
that checks the satisfiability of sets of such subtype relations, and his principal types may 
be empty. As a consequence, given an untypable expression, his algorithm does not neces- 
sarily indicate that no typing exists for that expression. Furthermore, like that of [Car84], 
Stansifer’s language does not contain any general mechanism for record extension or modi- 
fication, and so his system is also incomparable to that of [Wan87] and [JM88]. 

[OB88] presents a somewhat different core language that introduces sets, joins, and 
projections in the context of subtyping of records, and provides type inference for this
language. Because of the restrictions upon the join operation, this system is incomparable
to that of [Wan87], [JM88], and [Rem89], since one cannot translate into the language 
expressions like λx. x with a := x.a + 1. Using the join operation, however, it is possible
to translate expressions like λx. x with a := 3 into this language.

[Wand89] extends the core language of [Wan87] to express record concatenation, and 
gives a treatment of classes and multiple inheritance by using syntactic sugar for this un- 
derlying language. Using the system of [Rem89] extended to an infinite set of labels, a type 
inference algorithm is given that generates a set of most general typings for this language. 
If we consider the core language of [Wand89] without record concatenation, this system is 
a special case of that of [Rem89]. 

Leaving type inference aside, [CCHMO89] presents an extended form of “bounded quan- 
tification” [CW85] that seems useful when recursive type definitions and subtyping are used. 
Leaving subtyping of records aside, [FM88] presents a type inference algorithm based on 
that of [Mit84], and develops procedures for simplifying the set of subtyping assertions that 
are inferred by the algorithm. 

As is apparent from the preceding discussion, there exist a fair number of languages and 
type systems that embody some “object-oriented” features, and these systems differ from 
one another in some non-trivial technical ways. Although such a comparison is useful, it is 
important to ask some broader questions. First, which of the different features is it feasible 
to merge together with the aim of developing a more powerful “object-oriented” language? 
Second, what further extensions should we aim to develop? 

It seems to us that it is feasible to merge most of the approaches discussed above 
into one system with type inference. However, we feel that abstract data types capture 
an important property of object-oriented programming and should be incorporated into 
such typed languages. Other important features that should be considered are multiple 
inheritance, “self”, and “method specialization.” 

In this entire discussion, we have only considered type systems and type inference algo- 
rithms, and have not discussed whether these type systems are themselves “reasonable”. It 
would be interesting and worthwhile to look at the semantics of these systems, both from 
an operational and denotational point of view. Such an investigation does not appear in 
this thesis, and the interested reader is directed towards [Kam88, BL88, Red88, Coo89, 
BCGS89]. 


1.4 Outline of Thesis 


Chapter 2 presents the syntax of ML*. The typing system of the language, which is
specified by a set of typing axioms and inference rules, is presented in Chapter 3. The
notions of substitutions, instances, and “most general” typings are also developed in that
chapter. Finally, Chapter 4 presents the algorithm for inferring a most general typing for
any expression in ML*. In that chapter, we prove that the algorithm is sound with respect
to the type system, in the sense that whenever the algorithm gives term M type σ, the
assertion that M has type σ is provable from the typing rules. We also show that the
algorithm infers the most general type for any typable term. Specifically, if we can prove
M has type σ using the typing rules, then the algorithm succeeds in finding a typing for M
which is “more general” than σ in a precise sense developed in earlier chapters.


Chapter 2 


ML*: Types, Syntax, and 
Notation 


2.1 Types 


We begin with an infinite set of type variables and some fixed set of base types. We fix this 
set to be {int, bool, real, string}; however, our type system and algorithm can be easily 
extended to handle a larger set of base types. 

There are two forms of structured types: function types and record types. Function 
types are written using → as usual, so that σ → τ is the type of functions from σ to τ.
As mentioned in Chapter 1, we believe that variants (tagged unions) involve issues closely 
related to records [Car84, Wan87, Sta88, Rem89], but we do not consider them here. 

Record types are written in a slightly unusual way. Intuitively, record types are finite 
functions from labels to types. In our system, part of this finite function can be named but 
unspecified, so that it can be passed around and referred to without being fully specified 
until a later time. To support this naming of parts of record types, we follow [Wan87] and 
introduce an infinite set of row variables, which denote finite functions from labels to types, 
and the row constant NULL which denotes the empty function.

In order to make our type system and algorithm more understandable in a technical 
sense, while continuing to write our examples in ML*, we use two different notations for 
record types. In the formal notation, summarized in Table 2.3, record types are pairs (h, Z) 
where h is a finite function from labels to types and Z is either a row variable or NULL. 
We call a record type in which Z is a row variable an extended record type, and a record 
type in which Z is NULL fixed.

In ML*, record types are written {l1: τ1, ..., ln: τn; Z}. For example, the record type
{a: σ, b: τ; V} is intuitively the finite function that maps a to σ and b to τ, combined with
the as yet unspecified function denoted by the row variable V. On the other hand, the
record type {a: σ, b: τ; NULL} is intuitively the finite function that maps exactly a to σ and
b to τ.

Since we use finite functions to write records, all of the labels in a record must be distinct. 
This distinguishes our type expressions from the expressions used in [Wan87], and leads to 
an important and slightly subtle complication in our system. To avoid the algorithmic 
inefficiencies of Wand’s algorithm [Wan87], we must assume that the domains of h and Z 
(as finite functions) are disjoint. This complicates substitution of record expressions for row 
variables, since it only makes sense to replace a row variable by a record type which has
an appropriately limited domain. To avoid incorrect substitutions, our typing algorithm
therefore maintains restrictions on row variables.
The types are summarized in Table 2.1.


                                    base types
                                    type variables
σ → τ                               function types
{NULL}                              empty fixed records
{V}                                 empty extended records
{l1: τ1, ..., ln: τn; NULL}         fixed records
{l1: τ1, ..., ln: τn; V}            extended records

where li ≠ lj for all i ≠ j


Table 2.1: Summary of Types
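
The restriction on substituting record types for row variables can be pictured with a short sketch. It is only an illustration under assumptions: a record type is modeled as a pair (h, Z) of a Python dictionary and either a row-variable name or the marker NULL, and the disjointness of domains is checked at the moment of substitution, whereas the algorithm of this thesis maintains restrictions on row variables. The function name substitute_row is hypothetical.

```python
NULL = None  # stands for the row constant NULL


def substitute_row(record_type, row_var, replacement):
    """Replace row variable row_var in record_type = (h, Z) by replacement = (h2, Z2).

    The domains of h and h2 must be disjoint, so that the resulting
    record type has no duplicate labels.
    """
    h, z = record_type
    if z != row_var:
        return record_type  # the row variable does not occur here
    h2, z2 = replacement
    clash = set(h) & set(h2)
    if clash:
        raise ValueError(f"duplicate labels: {sorted(clash)}")
    return ({**h, **h2}, z2)


t = ({"a": "int"}, "V")  # the record type {a: int; V}
print(substitute_row(t, "V", ({"b": "bool", "c": "string"}, NULL)))
# ({'a': 'int', 'b': 'bool', 'c': 'string'}, None)
```

Substituting a row that already mentions a, such as ({"a": "real"}, NULL), raises an error in this sketch, which corresponds to the incorrect substitutions that must be avoided.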


2.2 The Language Syntax 


ML* is derived from a subset of ML with pattern matching, function abstraction, records,
ground constants, and built-in (possibly higher-order) constants. For the ground constants, 
we assume the set ℤ of integers, the set ℝ of reals, the set {true, false} of booleans, and
an infinite set of strings. Any closed term may be incorporated into ML* as a built-in
constant. We note here that although there is no explicit construct for declaring recursive 
functions, it is possible to define, as a built-in constant, an operator fix that returns the 
fixed-point of a function. Thus, using fix, we can define recursive functions.

The two new features of ML* are a more powerful (extended) form of pattern match- 
ing, which permits subtyping of records, and structural subtyping, which allows subtyping 
between base types. 

Extended pattern matching is achieved using an infinite set of extension variables, which 
play a role in the language similar to that of row variables in the type system. Formally, 
extension variables denote finite functions from labels to expressions, while the extension 
constant EMPTY denotes the empty function. Analogous to record types, record expres- 
sions are pairs whose first component is a finite function from labels to expressions and 
whose second component is an extension variable or EMPTY. As with record types, we 
call a record expression whose second element is an extension variable an extended record 
expression, and one whose second element is EMPTY fixed. We do not allow record expres-
sions with duplicate fields, thus imposing the same sort of restrictions on record expressions
as on record types. 

Patterns are simply a proper subset of expressions, consisting recursively of ground 
constants, variables, fixed record expressions, and extended record expressions. However, 
all variables and extension variables within a pattern must be distinct. 
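
A small matcher makes the role of extension variables in patterns concrete. The sketch below is an illustration under assumptions, not the formal semantics of ML*: the classes Var and Rec, the marker EMPTY, and the function match are hypothetical names, record values are Python dictionaries, and any pattern that is neither a Var nor a Rec is treated as a ground constant. It relies on the requirement stated above that all variables in a pattern are distinct, so no binding is ever overwritten.

```python
from dataclasses import dataclass

EMPTY = None  # stands for the extension constant EMPTY


@dataclass
class Var:
    name: str


@dataclass
class Rec:
    fields: dict  # label -> sub-pattern
    ext: object   # extension variable name (str) or EMPTY


def match(pattern, value, env=None):
    """Return a dict of bindings if value matches pattern, otherwise None."""
    env = {} if env is None else env
    if isinstance(pattern, Var):
        env[pattern.name] = value
        return env
    if isinstance(pattern, Rec):
        if not isinstance(value, dict) or not set(pattern.fields) <= set(value):
            return None
        for label, sub in pattern.fields.items():
            if match(sub, value[label], env) is None:
                return None
        rest = {l: v for l, v in value.items() if l not in pattern.fields}
        if pattern.ext is EMPTY:
            # A fixed record pattern allows no fields beyond those listed.
            return env if not rest else None
        env[pattern.ext] = rest  # bind the extension variable to the other fields
        return env
    # Ground constant: matches only an equal value.
    return env if pattern == value else None


# The pattern {a = x; v} against the record {a = 1, b = true, c = "extra"}.
print(match(Rec({"a": Var("x")}, "v"), {"a": 1, "b": True, "c": "extra"}))
# {'x': 1, 'v': {'b': True, 'c': 'extra'}}
```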

In ML*, as in ML, pattern matching provides a limited form of equality testing over
the structure of types, while pattern matching over constants provides an equality test for 
ground constants. For the sake of simplicity in this thesis, we do not provide a polymorphic 
equality function since such an extension, while fairly straightforward, would add a layer of 
technical complication to our system. 


Expressions


                                    ground constants
                                    built-in constants
x                                   variables
fn P => M                           abstraction
M N                                 application
{EMPTY}                             empty fixed records
{u}                                 empty extended records
{l1 = M1, ..., ln = Mn; EMPTY}      fixed records
{l1 = M1, ..., ln = Mn; u}          extended records


where li ≠ lj for all i ≠ j


Table 2.2: Summary of Expressions and Patterns 


The language is summarized in Table 2.2. 


2.3 Meta-notation


Since there are quite a few syntactic categories in ML*, the meta-notation we use is summa-
rized in Table 2.3. This notation will be used extensively in our proof rules and algorithm,
while the examples will be written using ML* syntax. 
expressions
ground constants 
built-in constants 
variables 
extension variables 
variables and extension variables 
extension variables and EMPTY 
finite functions from labels to expressions 
record expressions 
patterns, namely, 
abstraction-free, application-free expressions 
with no repeated variables 
or extension variables 


type expressions 

base types 

type variables 

row variables 

row variables and NULL 

finite functions from labels to types 
record types 


Table 2.3: Summary of Notational Conventions 
Chapter 3


The Typing System 


3.1 Overview 


As mentioned in the introduction, the main purpose of this thesis is twofold. First, we would 
like to define a typing system for ML+ that supports the form of subtyping discussed in
Chapter 1 and Chapter 2. Since the legal terms in ML+ are exactly the pre-terms in ML+
syntax that have a typing derivable through this typing system, the typing system serves to 
define the language. Secondly, we would like to develop a type inference algorithm that is 
“equivalent” to the typing system in the sense that it generates a typing for exactly those 
terms that have a typing derivable through the typing system. Moreover, for any typable 
expression M, we want the typing algorithm to generate a “most general” typing, whose 
set of “instances” is exactly the set of provable typings for M. 

In this chapter, we present the typing system for ML+ and give a precise definition of
instances and most general typings in the context of this typing system. Our typing system 
consists of two separate but related sets of typing rules, one for deriving subtype assertions, 
which are subtype relations between types, and the other for typing expressions. 

The chapter is organized as follows. First, in Section 3.2, we present the derivation 
rules for subtype assertions. Section 3.3 defines the form of typing statements, which are
formulas that capture the information we need in order to derive typings for expressions, 
and Section 3.4 presents a set of seemingly natural derivation rules for typing statements. In 
Section 3.5, we develop the “instance” relation in some detail and show that, in the typing 
system of Section 3.4, provable typings are not closed under the instance relation. As we 
show, this implies that “most general” typings do not exist for some expressions for which 
a type is derivable in the typing system. We examine the difficulty, and then modify the 
typing system a bit in order to get the R-typing system, which has the technical properties 
that we need. Section 3.6 is devoted to proving Theorem 3.6.7, the main theorem of the 
chapter, which shows that, in the R-typing system, all instances of a provable typing are 
provable. Section 3.7 develops the notion of most general typings in the context of the
R-typing system, and Section 3.8 examines how the R-typing system relates to the original
typing system.


3.2 Proof System for Subtyping Assertions 


As mentioned in the introduction, subtype assertions are of the form $\sigma \subseteq \tau$. We require
that the subtype relation, $\subseteq$, act like a pre-order on types. In particular, we follow [Mit84]
and introduce into our system for deriving subtype assertions the following three axioms
and rules.

\[ (\textit{ref}) \quad \sigma \subseteq \sigma \]

\[ (\textit{trans}) \quad \frac{\sigma \subseteq \gamma, \quad \gamma \subseteq \tau}{\sigma \subseteq \tau} \]

\[ (\textit{arrow}) \quad \frac{\tau_1 \subseteq \sigma_1, \quad \sigma_2 \subseteq \tau_2}{\sigma_1 \to \sigma_2 \subseteq \tau_1 \to \tau_2} \]

We note that the function type constructor $\to$ is antimonotonic in the first argument and
monotonic in the second.
We have a similar rule for the “subtype” relation on records. 


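\[ (\textit{record}) \quad \frac{(\emptyset, Z) \subseteq (\emptyset, Z'), \quad \forall l \in \mathrm{dom}(h).\ h(l) \subseteq h'(l)}{(h, Z) \subseteq (h', Z')} \qquad \mathrm{dom}(h) = \mathrm{dom}(h') \neq \emptyset \]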


We say that a subtype assertion $\sigma \subseteq \tau$ is provable from a set $C$ of subtype assertions,
written $C \vdash \sigma \subseteq \tau$, if $\sigma \subseteq \tau$ can be derived from assertions in $C$ using only the axioms
and rules above. We will write $C \vdash C'$ if $C \vdash \sigma \subseteq \tau$ for every $\sigma \subseteq \tau \in C'$. It is easy to show
that $\vdash$ is a transitive relation on sets.

In our typing system, we restrict $C$ to consist only of atomic subtype assertions, which
are subtype assertions of a certain simple form. We say that a subtype assertion $\sigma \subseteq \tau$ is
atomic if one of the following holds:

• $\sigma$ and $\tau$ are both type variables
• $\sigma$ and $\tau$ are both ground types
• Each of $\sigma$ and $\tau$ is a ground type or a type variable
• Each of $\sigma$ and $\tau$ is of the form $(\emptyset, Z)$

We say that $C$ is an atomic set or that $C$ is atomic if it consists only of atomic subtype
assertions.


3.3. Syntax of Typing Statements 

In the typing system for expressions, the typing statements have the form

\[ C, A \supset M : \sigma \]

where

• $C$ is an atomic set of subtype assertions.

• $A$ is a finite set of associations $x : \sigma$ between variables and types, and associations
$u : (h, Z)$ between extension variables and record types.

• $M$ is an expression and $\sigma$ is a type expression.
The typing $C, A \supset M : \sigma$ may be read as, “Given the subtype assertions in set $C$ and
the assignment of types to variables and extension variables described by $A$, the expression
$M$ has the type $\sigma$.”
To allow a slightly modified form of “ML-style” polymorphism, we consider all type
and row variables that occur in $C$ and $\sigma$ but do not occur in $A$ to be implicitly universally
quantified. For example, the typing

\[ \{s \subseteq t\},\ \emptyset \supset \mathtt{fn}\ x \Rightarrow x : s \to t \]


should be read as, “For all types s and t such that s C t, the identity function has the 
type s—t.” We note that this form of polymorphism does not correspond to the sort of 
polymorphism that the ML “let” construct provides. 


3.4 Proof System for Typing Statements 


In this section, we present a seemingly natural set of typing rules for expressions. We begin 
with the typing axioms for ground constants, which are as follows. 


\[ (\textit{int}) \quad C, A \supset b : int \qquad \text{whenever } b \text{ is an integer} \]


The axioms for reals, booleans, and strings are analogous, with int replaced by real, 
bool, and string, respectively. 
The typing rules for variables and application are standard. 


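\[ (\textit{var}) \quad C, A \supset x : \sigma \qquad \text{whenever } x : \sigma \in A \]

\[ (\textit{app}) \quad \frac{C, A \supset M : \sigma \to \tau, \quad C, A \supset N : \sigma}{C, A \supset MN : \tau} \]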


Due to pattern matching, the typing rule for lambda abstraction is somewhat more 
complicated than usual, since lambda abstraction may bind arbitrary patterns, in addition 
to variables. The typing rule for function expressions must therefore take the typing of 
patterns into account. There is a subtle but important reversal of subtype assertions in 
the abstraction rule which seems best illustrated by a simplified example. The pattern
$\{a = x, b = y\}$ has typing

\[ \{s \subseteq t\},\ \{x : s, y : s\} \supset \{a = x, b = y\} : \{a : t, b : t\}, \]

which means that if we give $x$ and $y$ values of type $s$, and $s \subseteq t$, then the pattern has type
$\{a : t, b : t\}$. However, when we lambda abstract over the pattern, we actually bind a value
to $\{a = x, b = y\}$ and access its components using variables $x$ and $y$. So, in effect, we use
the typing statement about the pattern “backwards.” The simplest technical adjustment
seems to be to reverse the set of subtype assertions before combining them with the typing
statement for the function body. If the function body is simply $x$, for example, then this
gives us the typing

\[ \{t \subseteq s\},\ \emptyset \supset \mathtt{fn}\ \{a = x, b = y\} \Rightarrow x : \{a : t, b : t\} \to s \]

while writing $s \subseteq t$ instead would give us an incorrect typing. (An alternative is to reformulate
the typing rules for patterns in a way that more accurately reflects their use. However,
the current formulation has algorithmic advantages due to the similarities with expressions.)
In order to state the typing rule for lambda abstraction, we need to develop some
notation and some definitions. Since the notion of free variables is important in the typing
rule for abstraction, we define vars($M$) inductively as:

• vars($b$) = vars($q$) = $\emptyset$
• vars($x$) = $\{x\}$
• vars($(f, \mathrm{EMPTY})$) = $\bigcup_{l \in \mathrm{dom}(f)}$ vars($f(l)$)
• vars($(f, u)$) = $\bigcup_{l \in \mathrm{dom}(f)}$ vars($f(l)$) $\cup\ \{u\}$
• vars($MN$) = vars($M$) $\cup$ vars($N$)
• vars($\mathtt{fn}\ P \Rightarrow M$) = vars($M$) $-$ vars($P$), where “$-$” is set difference

We define vars($A$) = $\{w \mid w : \sigma \in A\}$, where $w$ ranges over variables and extension variables.
We use the notation $C^{op}$ for the opposite set of subtype relations,

\[ C^{op} = \{\tau \subseteq \sigma \mid \sigma \subseteq \tau \in C\} \]

and $A[A']$ for the result of modifying $A$ so that every variable and extension variable mentioned
in $A'$ has the type specified by $A'$ rather than $A$. More precisely, $A[A'] = A_1 \cup A'$,
where $A = A_1 \cup A_2$ and $A_2 = \{w : \sigma \mid w : \sigma \in A \text{ and } w \in \mathrm{vars}(A')\}$.
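
The three pieces of notation just introduced are simple enough to sketch in code. This is an illustration only, under assumptions that are not in the thesis: expressions are tagged tuples (`("const", v)`, `("var", x)`, `("rec", fields, ext)`, `("app", M, N)`, `("fn", P, M)`), a set of subtype assertions is a set of `(lhs, rhs)` pairs, and a type environment is a dictionary. The names `vars_of`, `opposite`, and `override` are hypothetical.

```python
def vars_of(m):
    """vars(M): free variables and extension variables of an expression."""
    tag = m[0]
    if tag == "const":                       # ground and built-in constants
        return set()
    if tag == "var":
        return {m[1]}
    if tag == "rec":                         # ("rec", {label: expr}, ext or None for EMPTY)
        _, fields, ext = m
        out = set() if ext is None else {ext}
        for sub in fields.values():
            out |= vars_of(sub)
        return out
    if tag == "app":
        return vars_of(m[1]) | vars_of(m[2])
    if tag == "fn":                          # ("fn", pattern, body)
        return vars_of(m[2]) - vars_of(m[1])
    raise ValueError("unknown expression tag: %r" % (tag,))

def opposite(c):
    """C^op: reverse every subtype assertion (sigma, tau) in C."""
    return {(tau, sigma) for (sigma, tau) in c}

def override(a, a_prime):
    """A[A']: A with every binding for a name mentioned in A' replaced by A'."""
    merged = {w: ty for w, ty in a.items() if w not in a_prime}   # this is A_1
    merged.update(a_prime)
    return merged
```

For instance, in `fn {a = x, b = y; u} => x z` only `z` is free, since `x`, `y`, and `u` are bound by the pattern.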

The typing rule for abstraction is then as follows. 


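\[ (\textit{abs}) \quad \frac{C^{op}, A' \supset P : \sigma, \quad C, A[A'] \supset M : \tau}{C, A \supset \mathtt{fn}\ P \Rightarrow M : \sigma \to \tau} \qquad \mathrm{vars}(A') = \mathrm{vars}(P) \]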

The condition vars(A’) = vars(P) ensures that every variable occurring in A’ must 
occur in P. This is important, since we want A[A’] to modify A only on variables that 
become bound by fn. 

Proceeding to the typing rules for records, we first introduce the two axioms for records 
whose first component is the empty function. 


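\[ (\textit{rec1}) \quad C, A \supset (\emptyset, \mathrm{EMPTY}) : (\emptyset, \mathrm{NULL}) \]

\[ (\textit{rec2}) \quad C, A \supset (\emptyset, u) : (h, Z) \qquad \text{whenever } u : (h, Z) \in A \]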


The third typing rule is for records whose first component is a non-empty function. Since 
extension variables may be assigned arbitrary record types, we have a minor complication 
since we do not want duplicate field names. To express the rule succinctly, we define the 
partial operation + on finite functions from labels to types as 


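\[ h_1 + h_2 = \begin{cases} h \text{ s.t. } h(l) = \begin{cases} h_1(l) & \text{if } l \in \mathrm{dom}(h_1) \\ h_2(l) & \text{if } l \in \mathrm{dom}(h_2) \end{cases} & \text{if } \mathrm{dom}(h_1) \cap \mathrm{dom}(h_2) = \emptyset \\ \text{undefined} & \text{otherwise} \end{cases} \]

If dom($h_1$) and dom($h_2$) are not disjoint, then $h_1 + h_2$ is undefined.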
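
As a minimal sketch of this partial operation, assume finite functions from labels to types are Python dictionaries and that `None` stands for "undefined"; the name `plus` is hypothetical.

```python
def plus(h1, h2):
    """h1 + h2: the union of two field maps, or None when their domains overlap."""
    if h1.keys() & h2.keys():
        return None                 # a shared label would yield a duplicate field
    return {**h1, **h2}
```

With this encoding, `plus({"a": "int"}, {"b": "bool"})` is defined, while `plus({"a": "int"}, {"a": "bool"})` is not.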
The third rule for records is then as follows. 


\[ (\textit{rec3}) \quad \frac{C, A \supset (\emptyset, E) : (h_2, Z), \quad \forall l \in \mathrm{dom}(f).\ C, A \supset f(l) : h_1(l)}{C, A \supset (f, E) : (h_1 + h_2, Z)} \qquad \text{all typings well-formed} \]

where dom($f$) = dom($h_1$) $\neq \emptyset$.


It is worth saying a few words about the form of rec3. Generally, axioms and infer- 
ence rules are written as schemes, which are “abbreviations” for all of their substitution 
instances. However, not all substitution instances of rule rec3 are well formed, since +
may be undefined. Therefore, we interpret this rule “scheme” as indicating that any mean- 
ingful (well-formed) substitution instance may be used in deriving types for terms. As a 
consequence of the partiality of +, typing derivations that use rec3 may not be preserved
by arbitrary substitutions, but only those which, throughout the proof, yield well-formed 
types (i.e., types that do not anywhere contain record types that have duplicate labels). 

Our typing system also allows any set of closed expressions to be built in as term 
constants, as long as each is given a proper type. For example, the type for the language 
construct if can be built into the typing system as a term constant. A reasonable typing for
if is

\[ \emptyset, \emptyset \supset \mathtt{if} : bool \to t \to t \to t \]

It may seem at first glance that this typing does not capture any of the power that subtyping
provides us; however, as we will see in Section 3.5, we can derive from the above typing
that

\[ C, A \supset \mathtt{if} : s \to t_1 \to t_2 \to t' \]

where $C = \{s \subseteq bool,\ t_1 \subseteq t,\ t_2 \subseteq t,\ t \subseteq t'\}$,


and A is any type environment. Furthermore, as we shall see in Section 3.7, this second 
typing is the “most general” typing of if. 

It is useful to point out that an operator fix that returns the fixed point of a function
can also be defined as a built-in constant. A reasonable typing for fix is

\[ \emptyset, \emptyset \supset \mathtt{fix} : (t \to t) \to t \]

Again, we can derive the “most general” typing of fix from the above typing.
The axiom for term constants is 


\[ (\textit{const}) \quad C, A \supset q : S\tau_q \qquad \text{whenever } S \text{ is defined on } \tau_q \]

where $\tau_q$ is the built-in type for the constant $q$, and the application of the substitution $S$
to $\tau_q$ results in a well-formed type. The idea here is that since all type variables in $\tau_q$ are
considered to be implicitly universally quantified, we allow different substitution instances
of the type of a built-in constant to be used within the typing of any expression.

Although this typing axiom gives us the desired typing for if, we may want more flexibility
in building in the types of other built-in constants. More specifically, we may wish
to incorporate into our system a typing for a (closed) term $q$; that is, we wish to build in
a set $C_q$ as well as a type $\tau_q$ for $q$. Unfortunately, our system, as it exists, does not seem
to be powerful enough to handle these sorts of typings in general without sacrificing the
property of most general typings. However, it seems that we can build a large class of
such typings into our system as axioms, while maintaining most general typings; we omit
further discussion here.

Finally, in order to make use of subtyping, we have the rule 


\[ (\textit{coerce}) \quad \frac{C, A \supset M : \sigma, \quad C \vdash \sigma \subseteq \tau}{C, A \supset M : \tau} \]


By this rule, an expression may be considered to be of the type of any “supertype” of 
its actual type. 
3.5 Substitutions, Instances, and the R-typing System


Although the typing system in Section 3.4 seems to capture the sort of subtyping that we 
want to achieve, it turns out that, for some typable expressions, most general typings do not 
exist. As mentioned in the overview to this chapter, the problem with the typing system 
arises because, for some expressions, provable typings are not closed under the “instance” 
relation. In this section, we develop the notion of instance in some detail and define most 
general typings using this instance relation. We discuss the difficulty in the typing system 
of Section 3.4, and then modify the typing system in order to correct this difficulty. We call 
the modified typing system the R-typing system. 

We would like to begin by straightaway discussing the notion of “instance”, but as the 
basic part of the instance relation consists of a substitution, which is a finite function from 
type variables to types and from row variables to record types, we need to first discuss 
substitutions in some detail. Because a substitution may map a row variable to a record 
type (h, Z) where h is a non-empty function, applying a substitution naively may result 
in a record type containing duplicate labels. To prevent this, we will treat the process of 
applying a substitution to a type as a partial operation. 

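Using the operation + developed in the previous section, we define the partial operation
$S\sigma$, the action of a substitution $S$ on a type $\sigma$, where we write $S[t] = \sigma$ if $S$ maps $t$ to $\sigma$,
and $S[\mathcal{V}] = \sigma$ if $S$ maps $\mathcal{V}$ to $\sigma$. $S\sigma$ is defined inductively as:

• $Sc = c$

• $St = S[t]$ if $t \in \mathrm{dom}(S)$, and $St = t$ otherwise

• $S(\emptyset, \mathcal{V}) = S[\mathcal{V}]$ if $\mathcal{V} \in \mathrm{dom}(S)$, and $S(\emptyset, \mathcal{V}) = (\emptyset, \mathcal{V})$ otherwise

• $S(\emptyset, \mathrm{NULL}) = (\emptyset, \mathrm{NULL})$

• $S(h, Z) = (h;S + \mathrm{left}(S(\emptyset, Z)),\ \mathrm{right}(S(\emptyset, Z)))$ if $h;S$ is defined and the + operation
  is defined, and $S(h, Z)$ is undefined otherwise

• $S(\sigma \to \tau) = S\sigma \to S\tau$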

It is important to note that, due to the partiality of +, the action of $S$ on a record
type is undefined if an ill-formed function from labels to types is created anywhere within
the resulting record type. Since $h$ may (recursively) map a label to a record type, the
composition of a finite function $h$ from labels to types with a substitution $S$, written $h;S$,
is also a partial operation. We define $h;S$ as:

\[ h;S = \begin{cases} h' \text{ s.t. } h'(l) = S(h(l)) & \text{if } \forall l \in \mathrm{dom}(h).\ S(h(l)) \text{ is defined} \\ \text{undefined} & \text{otherwise} \end{cases} \]
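
The action of a substitution on a type can be sketched as follows. The encoding is an assumption made for illustration, not the thesis's notation: types are tagged tuples (`("base", c)`, `("tvar", t)`, `("arrow", s, t)`, `("rec", fields, row)` with `row` a row-variable name or `None` for NULL), a substitution is a dictionary with hypothetical keys `"tvars"` and `"rows"`, and `None` stands for "undefined".

```python
def apply(S, ty):
    """Return S applied to ty, or None where the definition says 'undefined'."""
    tag = ty[0]
    if tag == "base":
        return ty
    if tag == "tvar":
        return S["tvars"].get(ty[1], ty)
    if tag == "arrow":
        dom, cod = apply(S, ty[1]), apply(S, ty[2])
        return None if dom is None or cod is None else ("arrow", dom, cod)
    # Record type (h, Z): first compute S(empty, Z), then h;S, then combine with +.
    _, h, z = ty
    if z is None:
        tail = ("rec", {}, None)                  # S(empty, NULL) = (empty, NULL)
    else:
        tail = S["rows"].get(z, ("rec", {}, z))   # S[Z] if Z is in dom(S)
    new_h = {}
    for label, field in h.items():                # h;S, undefined if any field is
        image = apply(S, field)
        if image is None:
            return None
        new_h[label] = image
    _, tail_h, tail_z = tail
    if new_h.keys() & tail_h.keys():              # + is undefined on overlapping labels
        return None
    return ("rec", {**new_h, **tail_h}, tail_z)
```

Applying a substitution that maps the row variable to `{a: bool; NULL}` extends `{b: t; V}` with the field `a`, but is undefined on `{a: int; V}`, where `a` would receive two types.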


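Similarly, composition of a substitution $S$ with another substitution $T$, written $S;T$, is
a partial operation. We define $S;T$ as:

\[ S;T = \begin{cases} U & \text{if } \forall t \in \mathrm{dom}(S).\ T(S[t]) \text{ is defined and } \forall \mathcal{V} \in \mathrm{dom}(S).\ T(S[\mathcal{V}]) \text{ is defined} \\ \text{undefined} & \text{otherwise} \end{cases} \]

where dom($U$) = dom($S$) $\cup$ dom($T$), and

\[ U[t] = \begin{cases} T(S[t]) & \text{if } t \in \mathrm{dom}(S) \\ T[t] & \text{if } t \in \mathrm{dom}(T) \text{ and } t \notin \mathrm{dom}(S) \end{cases} \]

and

\[ U[\mathcal{V}] = \begin{cases} T(S[\mathcal{V}]) & \text{if } \mathcal{V} \in \mathrm{dom}(S) \\ T[\mathcal{V}] & \text{if } \mathcal{V} \in \mathrm{dom}(T) \text{ and } \mathcal{V} \notin \mathrm{dom}(S) \end{cases} \]

We note that if $S;T$ is defined, then $(S;T)\sigma = T(S\sigma)$, or both are undefined. We also
note that $h;(S;T) = (h;S);T$, or both are undefined. Likewise, composition of substitutions
is associative.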

We also define the action of a substitution on type environments and on sets of
possibly non-atomic subtype assertions. A substitution $S$ applied to $A$ is the assignment
$SA = \{w : S\sigma \mid w : \sigma \in A\}$, where $w$ ranges over variables and extension variables, if $S$ is
defined on all such $\sigma$. We consider $SA$ undefined otherwise. Similarly, the application of a
substitution $S$ to a possibly non-atomic $C$ is the set $SC = \{S\tau_1 \subseteq S\tau_2 \mid \tau_1 \subseteq \tau_2 \in C\}$,
provided $S\tau_i$ is defined for all such $\tau_i$. Again, we consider $SC$ undefined otherwise.

In defining our instance relation, we could follow [Mit84] and say that a typing
$C', A' \supset M : \sigma'$ is an instance of a typing $C, A \supset M : \sigma$ by a substitution $S$ if $S$ is defined
on $C$, $A$, and $\sigma$, and

\[ C' \vdash SC, \quad A' \supseteq SA, \quad \text{and} \quad \sigma' = S\sigma. \]

However, since in the above definition, $SC$ may be a non-atomic set of subtype assertions,
we cannot use the same sort of reasoning about $SC$ as we do for $C$ and $C'$. Thus, in order
to simplify the proofs of our main theorems, we use a seemingly more complex notion of
instance that uses, in place of $SC$, an atomic set $S \bullet C$ that is computable and that is closely
related to $SC$. We show that this definition is, in fact, “equivalent” to the one above in
the sense that $C', A' \supset M : \sigma'$ is an instance of $C, A \supset M : \sigma$ by a substitution $S$ under the
above definition iff it is an instance under this definition.

The basic idea is as follows. We want to define $S \bullet C$ as a “least” atomic set that
implies $SC$, so that $S \bullet C \vdash SC$; that is, any atomic set $C'$ that implies $SC$ should also
imply $S \bullet C$. It then follows that, for any atomic set $C'$, $C' \vdash SC$ iff $C' \vdash S \bullet C$, and thus,
the two above definitions of instance are equivalent.

The reasoning we will use in defining the $\bullet$ operation is as follows. We will first show
that, in order for an atomic set $C'$ to imply $SC$, $SC$ must contain only subtype assertions
that have a certain form. We will then show how to compute, from any subtype assertion
$\sigma \subseteq \tau$ of this form, a “least” atomic set that implies $\sigma \subseteq \tau$. Thus, any atomic set $C'$ that
implies $\sigma \subseteq \tau$ must also imply this computed atomic set. Using these ideas, we will define
$S \bullet C$ so that it has the properties outlined above.

We begin by showing that an atomic set can imply only matching subtype assertions
$\sigma \subseteq \tau$, where $\sigma$ and $\tau$ have the same syntactic form. More precisely, we define the relation
match on types recursively as:

• $t_1$ matches $t_2$
• $c_1$ matches $c_2$
• $t$ matches $c$, and $c$ matches $t$
• $(\emptyset, Z)$ matches $(\emptyset, Z')$
• $\sigma$ matches $\sigma'$ and $\tau$ matches $\tau'$ iff $\sigma \to \tau$ matches $\sigma' \to \tau'$
• dom($h$) = dom($h'$) and $\forall l \in \mathrm{dom}(h).\ h(l)$ matches $h'(l)$
  iff $(h, Z)$ matches $(h', Z')$

We say that a subtype assertion $\sigma \subseteq \tau$ is matching if $\sigma$ and $\tau$ match. It is easy to see
that match is an equivalence relation on types.
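
The match relation amounts to comparing the shapes of two types while ignoring which atoms and row variables occur. A minimal sketch, assuming a hypothetical tagged-tuple encoding of types (`("base", c)`, `("tvar", t)`, `("arrow", s, t)`, `("rec", fields, row)` with `fields` a dictionary):

```python
def matches(s, t):
    """True when types s and t have the same shape in the sense of 'match'."""
    atoms = ("base", "tvar")
    if s[0] in atoms and t[0] in atoms:
        return True                  # any two ground types or type variables match
    if s[0] == "arrow" and t[0] == "arrow":
        return matches(s[1], t[1]) and matches(s[2], t[2])
    if s[0] == "rec" and t[0] == "rec":
        h, h2 = s[1], t[1]           # the row components are irrelevant to matching
        return h.keys() == h2.keys() and all(matches(h[l], h2[l]) for l in h)
    return False
```

In this sketch `s -> int` matches `bool -> t`, whereas a type variable never matches an arrow type, and two record types match only when they carry the same labels.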


Lemma 3.5.1 If $C \vdash \sigma \subseteq \tau$, where $C$ is atomic, then $\sigma$ matches $\tau$.


Proof. We prove the lemma by induction on the length of the derivation of $\sigma \subseteq \tau$
from $C$.

If the proof of $\sigma \subseteq \tau$ from $C$ requires no steps, then $\sigma \subseteq \tau \in C$ and is thus an atomic
subtype assertion. Therefore, $\sigma$ matches $\tau$.

Otherwise, the proof must have ended in an application of either (trans), (arrow), or
(record). If the final step was (trans), then $C \vdash \sigma \subseteq \tau$ must have followed from the antecedents
$\sigma \subseteq \gamma$ and $\gamma \subseteq \tau$ for some $\gamma$. Since the proofs of $\sigma \subseteq \gamma$ and $\gamma \subseteq \tau$ are shorter, we may
assume that $\sigma$ matches $\gamma$ and $\gamma$ matches $\tau$. Thus, by the properties of match, $\sigma$ matches $\tau$.

If the final step was (arrow), then $\sigma$ must be of the form $\sigma = \sigma_1 \to \sigma_2$ and $\tau$ must be
of the form $\tau = \tau_1 \to \tau_2$, and $C \vdash \sigma \subseteq \tau$ must have followed from the antecedents $\tau_1 \subseteq \sigma_1$
and $\sigma_2 \subseteq \tau_2$. Since the proofs of $\tau_1 \subseteq \sigma_1$ and $\sigma_2 \subseteq \tau_2$ are shorter, we may assume that $\tau_1$
matches $\sigma_1$ and $\sigma_2$ matches $\tau_2$. Thus, by the properties of match, $\sigma$ matches $\tau$.

If the final step was (record), then $\sigma$ must be of the form $\sigma = (h, Z)$ and $\tau$ must be of the
form $\tau = (h', Z')$, where dom($h$) = dom($h'$) $\neq \emptyset$. Moreover, $C \vdash \sigma \subseteq \tau$ must have followed
from the antecedents $(\emptyset, Z) \subseteq (\emptyset, Z')$ and $\forall l \in \mathrm{dom}(h).\ h(l) \subseteq h'(l)$. Since the proofs of the
$h(l) \subseteq h'(l)$ are shorter, we may assume that $\forall l \in \mathrm{dom}(h).\ h(l)$ matches $h'(l)$. Thus, by the
properties of match, $\sigma$ matches $\tau$, proving the lemma. ∎

We now wish to show that for any matching subtype assertion $\sigma \subseteq \tau$, we can compute
an atomic set $C$ that implies $\sigma \subseteq \tau$, and such that for any atomic set $C'$ that implies $\sigma \subseteq \tau$,
$C'$ also implies $C$. However, in order to justify “decomposing” $\sigma \subseteq \tau$ into atomic subtype
assertions, we will need to use the following lemmas about the form of derivable subtype
assertions.


Lemma 3.5.2 For any atomic set $C$, $C \vdash \sigma_1 \to \sigma_2 \subseteq \tau_1 \to \tau_2$ iff $C \vdash \tau_1 \subseteq \sigma_1$ and
$C \vdash \sigma_2 \subseteq \tau_2$.


Proof. The proof of this lemma is exactly that of [Mit84].

One direction is a direct consequence of rule (arrow): if $C \vdash \tau_1 \subseteq \sigma_1$ and $C \vdash \sigma_2 \subseteq \tau_2$,
then $C \vdash \sigma_1 \to \sigma_2 \subseteq \tau_1 \to \tau_2$. It remains to prove the converse.

We show that if $C \vdash \sigma \subseteq \tau$ for any $\sigma$ and $\tau$ of the form $\sigma = \sigma_1 \to \sigma_2$ and $\tau = \tau_1 \to \tau_2$,
then there is a proof of $\sigma \subseteq \tau$ from $C$ that ends with an application of the rule (arrow).
We argue by induction on the length of the proof of $\sigma \subseteq \tau$ from $C$. If the proof is one step,
then this step must be an application of rule (arrow) and so, trivially, there must be a proof
ending in an application of (arrow).

For the inductive step, assume that we have a proof whose final step is a use of rule
(trans) from antecedents $\sigma \subseteq \gamma$ and $\gamma \subseteq \tau$. By Lemma 3.5.1, we know that $\gamma$ has the form
$\gamma = \gamma_1 \to \gamma_2$. Since the proofs of $\sigma \subseteq \gamma$ and $\gamma \subseteq \tau$ are shorter, we may assume that we
have proofs of these inclusions ending in applications of rule (arrow). Thus,

\[ C \vdash \gamma_1 \subseteq \sigma_1, \quad \sigma_2 \subseteq \gamma_2, \quad \tau_1 \subseteq \gamma_1, \quad \gamma_2 \subseteq \tau_2. \]

By rule (trans), we have $C \vdash \tau_1 \subseteq \sigma_1$ and $C \vdash \sigma_2 \subseteq \tau_2$, which proves the lemma. ∎
We have a similar lemma for records. 


Lemma 3.5.3 For any atomic set $C$, $C \vdash (h, Z) \subseteq (h', Z')$, where $h$ is non-empty, iff
dom($h$) = dom($h'$) $\neq \emptyset$ and $C \vdash (\emptyset, Z) \subseteq (\emptyset, Z')$ and $\forall l \in \mathrm{dom}(h).\ C \vdash h(l) \subseteq h'(l)$.


Proof. The proof of this lemma is analogous to the proof of Lemma 3.5.2.

One direction is a direct consequence of rule (record): if dom($h$) = dom($h'$) $\neq \emptyset$ and
$C \vdash (\emptyset, Z) \subseteq (\emptyset, Z')$ and $\forall l \in \mathrm{dom}(h).\ C \vdash h(l) \subseteq h'(l)$, then $C \vdash (h, Z) \subseteq (h', Z')$, where
$h$ is non-empty. It remains to prove the converse.

We show that if $C \vdash \sigma \subseteq \tau$ for any $\sigma$ and $\tau$ of the form $\sigma = (h, Z)$ and $\tau = (h', Z')$,
where $h$ is non-empty, then there is a proof of $\sigma \subseteq \tau$ from $C$ that ends with an application
of the rule (record). We argue by induction on the length of the proof of $\sigma \subseteq \tau$ from $C$. If
the proof is one step, then this step must be an application of rule (record) and so, trivially,
there must be a proof ending in an application of (record).

For the inductive step, assume that we have a proof whose final step is a use of rule
(trans) from antecedents $\sigma \subseteq \gamma$ and $\gamma \subseteq \tau$. By Lemma 3.5.1, we know that $\sigma$, $\gamma$, and $\tau$
match, and that $\gamma$ has the form $\gamma = (h'', Z'')$, where dom($h$) = dom($h''$) = dom($h'$). Since
the proofs of $\sigma \subseteq \gamma$ and $\gamma \subseteq \tau$ are shorter than that of $\sigma \subseteq \tau$, we may assume that we
have proofs of these inclusions ending in applications of rule (record). Thus,
$C \vdash (\emptyset, Z) \subseteq (\emptyset, Z'')$, $C \vdash (\emptyset, Z'') \subseteq (\emptyset, Z')$, and

\[ \text{for all } l \in \mathrm{dom}(h), \quad C \vdash h(l) \subseteq h''(l) \ \text{ and } \ C \vdash h''(l) \subseteq h'(l). \]

By rule (trans), $C \vdash (\emptyset, Z) \subseteq (\emptyset, Z')$ and $\forall l \in \mathrm{dom}(h).\ C \vdash h(l) \subseteq h'(l)$, which proves the
lemma. ∎
Using the above lemmas, we can now prove the property we desire. 


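Lemma 3.5.4 Let $\sigma$ and $\tau$ be matching type expressions. There is an atomic set
$C$ = ATOMIC($\sigma \subseteq \tau$) with $C \vdash \sigma \subseteq \tau$ and such that if $C'$ is any atomic set of subtype
assertions with $C' \vdash \sigma \subseteq \tau$, then $C' \vdash C$.


Proof.
We define ATOMIC($\sigma \subseteq \tau$) as:

• If $\sigma \subseteq \tau$ is an atomic subtype assertion, then ATOMIC($\sigma \subseteq \tau$) = $\{\sigma \subseteq \tau\}$.

• If $\sigma \subseteq \tau$ is of the form $\sigma_1 \to \sigma_2 \subseteq \tau_1 \to \tau_2$, then

\[ \mathrm{ATOMIC}(\sigma \subseteq \tau) = \mathrm{ATOMIC}(\tau_1 \subseteq \sigma_1) \cup \mathrm{ATOMIC}(\sigma_2 \subseteq \tau_2). \]

• If $\sigma \subseteq \tau$ is of the form $(h, Z) \subseteq (h', Z')$, where $h$ and $h'$ are non-empty, then

\[ \mathrm{ATOMIC}(\sigma \subseteq \tau) = \bigcup_{l \in \mathrm{dom}(h)} \mathrm{ATOMIC}(h(l) \subseteq h'(l)) \ \cup\ \mathrm{ATOMIC}((\emptyset, Z) \subseteq (\emptyset, Z')). \]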


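Let $C$ = ATOMIC($\sigma \subseteq \tau$). We first prove, by induction on the structure of $\sigma \subseteq \tau$, that
$C \vdash \sigma \subseteq \tau$ and that $C$ is atomic.

If $\sigma \subseteq \tau$ is atomic, then, trivially, $C \vdash \sigma \subseteq \tau$ and $C$ is atomic.

If $\sigma$ is of the form $\sigma = \sigma_1 \to \sigma_2$ and $\tau$ is of the form $\tau = \tau_1 \to \tau_2$, then we can assume
inductively that ATOMIC($\tau_1 \subseteq \sigma_1$) $\vdash \tau_1 \subseteq \sigma_1$ and ATOMIC($\sigma_2 \subseteq \tau_2$) $\vdash \sigma_2 \subseteq \tau_2$. Thus,
$C \vdash \tau_1 \subseteq \sigma_1$ and $C \vdash \sigma_2 \subseteq \tau_2$, and so, by rule (arrow), $C \vdash \sigma \subseteq \tau$. To show that $C$ is
atomic, we can assume inductively that both ATOMIC($\tau_1 \subseteq \sigma_1$) and ATOMIC($\sigma_2 \subseteq \tau_2$)
are atomic; thus, so is $C$.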

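If $\sigma$ is of the form $\sigma = (h, Z)$ and $\tau$ is of the form $\tau = (h', Z')$, where $h$ and $h'$ are
non-empty, then, since $\sigma \subseteq \tau$ is matching, dom($h$) = dom($h'$). Since dom($h$) $\neq \emptyset$, we can
assume inductively that

\[ \mathrm{ATOMIC}((\emptyset, Z) \subseteq (\emptyset, Z')) \vdash (\emptyset, Z) \subseteq (\emptyset, Z') \]

and that

\[ \text{for all } l \in \mathrm{dom}(h), \quad \mathrm{ATOMIC}(h(l) \subseteq h'(l)) \vdash h(l) \subseteq h'(l). \]

Thus,

\[ C \vdash (\emptyset, Z) \subseteq (\emptyset, Z') \ \text{ and } \ \forall l \in \mathrm{dom}(h).\ C \vdash h(l) \subseteq h'(l). \]

Therefore, by rule (record), $C \vdash \sigma \subseteq \tau$. To show that $C$ is atomic, we can assume inductively
that ATOMIC($(\emptyset, Z) \subseteq (\emptyset, Z')$) is atomic, and that $\forall l \in \mathrm{dom}(h)$, ATOMIC($h(l) \subseteq h'(l)$)
is atomic. Therefore, so is $C$.

Since $\sigma \subseteq \tau$ is matching, one of the above cases must hold, thus proving the first part
of the lemma.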

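To show that this set is minimal, we again proceed by induction on the structure of
subtype assertions. We wish to prove that, for any atomic set $C'$, if $C' \vdash \sigma \subseteq \tau$, then
$C' \vdash$ ATOMIC($\sigma \subseteq \tau$).

If $\sigma \subseteq \tau$ is atomic and $C' \vdash \sigma \subseteq \tau$, then $C' \vdash$ ATOMIC($\sigma \subseteq \tau$) trivially.

If $\sigma \subseteq \tau$ is of the form $\sigma_1 \to \sigma_2 \subseteq \tau_1 \to \tau_2$, then, by Lemma 3.5.2, $C' \vdash \tau_1 \subseteq \sigma_1$ and
$C' \vdash \sigma_2 \subseteq \tau_2$. We can thus assume inductively that $C' \vdash$ ATOMIC($\tau_1 \subseteq \sigma_1$) and
$C' \vdash$ ATOMIC($\sigma_2 \subseteq \tau_2$). Therefore, $C' \vdash C$.

If $\sigma$ is of the form $\sigma = (h, Z)$ and $\tau$ is of the form $\tau = (h', Z')$, where $h$ and $h'$ are
non-empty, then, by Lemma 3.5.3, dom($h$) = dom($h'$) $\neq \emptyset$, $C' \vdash (\emptyset, Z) \subseteq (\emptyset, Z')$, and
$\forall l \in \mathrm{dom}(h).\ C' \vdash h(l) \subseteq h'(l)$. Since dom($h$) $\neq \emptyset$, we can assume inductively that

\[ C' \vdash \mathrm{ATOMIC}((\emptyset, Z) \subseteq (\emptyset, Z')) \ \text{ and } \ \forall l \in \mathrm{dom}(h).\ C' \vdash \mathrm{ATOMIC}(h(l) \subseteq h'(l)). \]

Thus, $C' \vdash C$.

Since $\sigma \subseteq \tau$ is matching, one of the above cases must hold, thus proving the lemma. ∎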
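
The recursive definition of ATOMIC used in this proof translates almost directly into code. The following is a sketch under assumptions that are not in the thesis: types are hashable tagged tuples (`("base", c)`, `("tvar", t)`, `("arrow", s, t)`, `("rec", fields, row)` with `fields` a tuple of `(label, type)` pairs and `row` a row-variable name or `None` for NULL), and an assertion is a `(lhs, rhs)` pair. The name `atomic` is hypothetical.

```python
def atomic(sigma, tau):
    """ATOMIC(sigma <= tau) for matching types, as a set of (lhs, rhs) pairs."""
    atoms = ("base", "tvar")
    if sigma[0] in atoms and tau[0] in atoms:
        return {(sigma, tau)}                      # already an atomic assertion
    if sigma[0] == "arrow" and tau[0] == "arrow":
        # The argument position is reversed, mirroring rule (arrow).
        return atomic(tau[1], sigma[1]) | atomic(sigma[2], tau[2])
    if sigma[0] == "rec" and tau[0] == "rec":
        h, h2 = dict(sigma[1]), dict(tau[1])
        if h.keys() == h2.keys():
            # Keep the assertion on the row parts, then decompose field by field.
            out = {(("rec", (), sigma[2]), ("rec", (), tau[2]))}
            for label in h:
                out |= atomic(h[label], h2[label])
            return out
    raise ValueError("subtype assertion is not matching")
```

For example, decomposing `s -> int` against `t -> int` yields the two atomic assertions `t <= s` and `int <= int`; the first is reversed because the arrow constructor is antimonotonic in its first argument.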

Before proceeding to develop our notion of instance, we need some definitions. We
say that a substitution $S$ is a matching substitution for a set $C$ of possibly nonmatching
assertions $\tau_1 \subseteq \tau_2$ if $SC$ is defined and every $S\tau_1 \subseteq S\tau_2 \in SC$ is matching. Furthermore,
we say that a substitution $S$ respects a set $C$ of atomic subtype assertions $\tau_1 \subseteq \tau_2$ if $SC$ is
defined and every $S\tau_1 \subseteq S\tau_2 \in SC$ is matching.
Using the results of Lemma 3.5.4, we now define

\[ S \bullet C = \bigcup_{\sigma \subseteq \tau \in C} \mathrm{ATOMIC}(S\sigma \subseteq S\tau) \]

for any $S$ that respects an atomic set $C$. The following lemma states that this definition of
$\bullet$ gives us precisely the desired behavior.


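Lemma 3.5.5 If a substitution $S$ respects an atomic set $C$, then there is an atomic set
$S \bullet C$ with $S \bullet C \vdash SC$ and such that if $C'$ is any set of atomic subtype assertions with
$C' \vdash SC$, then $C' \vdash S \bullet C$.


Proof. Let $S \bullet C = \bigcup_{\sigma \subseteq \tau \in C}$ ATOMIC($S\sigma \subseteq S\tau$). The proof of the lemma follows
easily by Lemma 3.5.4.

Since $S$ respects $C$, we know that $\forall \sigma \subseteq \tau \in C.\ S\sigma$ matches $S\tau$. Then, by Lemma
3.5.4, we know that $\forall \sigma \subseteq \tau \in C$, ATOMIC($S\sigma \subseteq S\tau$) is atomic and that
$\forall \sigma \subseteq \tau \in C$, ATOMIC($S\sigma \subseteq S\tau$) $\vdash S\sigma \subseteq S\tau$. Thus, $S \bullet C$ is atomic. Moreover, since
$SC = \bigcup_{\sigma \subseteq \tau \in C} \{S\sigma \subseteq S\tau\}$,

\[ \bigcup_{\sigma \subseteq \tau \in C} \mathrm{ATOMIC}(S\sigma \subseteq S\tau) \vdash SC, \]

proving the first part of the lemma.
To prove the second part of the lemma, we note that $C' \vdash SC$ implies that
$\forall \sigma \subseteq \tau \in C.\ C' \vdash S\sigma \subseteq S\tau$. By Lemma 3.5.4, we know that
$\forall \sigma \subseteq \tau \in C$, $C' \vdash$ ATOMIC($S\sigma \subseteq S\tau$). Thus,

\[ C' \vdash \bigcup_{\sigma \subseteq \tau \in C} \mathrm{ATOMIC}(S\sigma \subseteq S\tau), \]

which proves the lemma. ∎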
We note that, in our original definition of instance, if $C' \vdash SC$ then, by Lemma 3.5.1,
$S$ must respect $C$. Thus, putting it all together, we say that


Definition 3.5.6 A typing $C', A' \supset M : \sigma'$ is an instance of the typing $C, A \supset M : \sigma$ by a
substitution $S$ if $S$ respects $C$, is defined on $A$ and $\sigma$, and

\[ C' \vdash S \bullet C, \quad A' \supseteq SA, \quad \text{and} \quad \sigma' = S\sigma. \]


We often simply say one statement is an instance of another, without mentioning the 
substitution involved. 

We wish to prove ultimately that, for any typable expression M, our type inference 
algorithm infers a “most general” typing; that is, a provable typing for M whose set of 
instances is exactly the set of provable typings for M. To this end, we would like to follow 
[Mit84] and show that every instance of a provable typing is provable. Then, simply showing 
that the typing inferred by the algorithm is provable would give the proof of one direction. 
However, it turns out that the partiality of substitution complicates the situation. 

Since a derivation of a typing $C, A \supset M : \sigma$ may involve type expressions that do not
appear explicitly in this statement, a substitution which is defined on $C, A \supset M : \sigma$ may not
be defined on all typings used in its proof. To see why this is a problem, consider the typing

\[ \emptyset,\ v : \{\mathcal{V}\} \supset (\mathtt{fn}\ \{a = x; u\} \Rightarrow \{u\})\{a = 3; v\} : \{\mathcal{V}\} \]

which follows from the typing

\[ \emptyset,\ v : \{\mathcal{V}\} \supset \mathtt{fn}\ \{a = x; u\} \Rightarrow \{u\} : \{a : int; \mathcal{V}\} \to \{\mathcal{V}\} \]

by rule app. For the sake of simplicity, we do not consider the set of subtype assertions,
since they are not relevant to this discussion. The typing statement for the first term
is syntactically well-formed if we substitute any record type for $\mathcal{V}$. However, the second
typing, which is needed to prove the first, becomes ill-formed if we replace $\mathcal{V}$ by a record
type containing the field $a : bool$, say, since this gives $a$ two types. Thus, for the expression
$(\mathtt{fn}\ \{a = x; u\} \Rightarrow \{u\})\{a = 3; v\} : \{\mathcal{V}\}$, there is an instance of a provable typing that is
NOT provable. It is easy to show by contradiction that this implies that no most general
typing exists for the above expression.
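
The failure in this example can be replayed mechanically. The sketch below makes several assumptions for illustration only: a record type is a pair `(fields, row)` with `fields` a dictionary and `row` a row-variable name or `None` for NULL, a substitution is a dictionary on row variables only, field types are plain strings, and `None` as a result means "undefined". The names `apply_row`, `conclusion_type`, and `cut_type` are hypothetical.

```python
def apply_row(S, record):
    """Apply a row-variable substitution to a record type; None means undefined."""
    fields, row = record
    tail_fields, tail_row = S.get(row, ({}, row))   # S[row], or the row unchanged
    if fields.keys() & tail_fields.keys():
        return None                                 # a label would get two types
    return ({**fields, **tail_fields}, tail_row)

# Substitute the record type {a: bool; NULL} for the row variable V.
S = {"V": ({"a": "bool"}, None)}

conclusion_type = ({}, "V")             # {V}, the type in the final typing statement
cut_type = ({"a": "int"}, "V")          # {a: int; V}, which occurs only inside the derivation

print(apply_row(S, conclusion_type))    # defined
print(apply_row(S, cut_type))           # undefined
```

The substitution is defined on every type that appears in the final statement, yet undefined on the argument type that rule app eliminates, which is exactly the bookkeeping problem described above.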

This example illustrates the fact that in order to have every instance of a provable typing 
be provable, we must impose additional conditions on substitutions. Specifically, we must 
keep track of type expressions that appear in the derivations, but not in the final typing 
statement. 

Fortunately, the required bookkeeping is not as complicated as it might appear at first 
glance. A careful analysis of the typing rules reveals that for all but one rule, any sub- 
stitution defined on the consequent of the rule (the typing statement below the horizontal 
line) will necessarily be defined on all assertions in the antecedent. For rule coerce, this is
a consequence of the following lemma.


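Lemma 3.5.7 Suppose that a substitution $S$ respects an atomic set $C$, and $C \vdash \sigma \subseteq \tau$ or
$C \vdash \tau \subseteq \sigma$.


1. If $S$ is defined on $\sigma$, then $S$ is defined on $\tau$.
2. If $S$ is defined on $\sigma$ and $\tau$, then $S\sigma$ matches $S\tau$.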


Proof. We note that by Lemma 3.5.1, $\sigma$ and $\tau$ must match. We also note that, for
any $S$ and $Z$, $S(\emptyset, Z)$ is always defined. We prove the lemma by induction on the length
of the derivation of $\sigma \subseteq \tau$ or $\tau \subseteq \sigma$ from $C$.

If the derivation of $\sigma \subseteq \tau$ from $C$ requires no steps, then $\sigma \subseteq \tau$ is of the form $t_1 \subseteq t_2$,
$c \subseteq t$, $t \subseteq c$, $c_1 \subseteq c_2$, or $(\emptyset, Z) \subseteq (\emptyset, Z')$. To prove (1), we note that the application of
$S$ to a type variable, ground type, or a record type whose first component is the empty
function is always defined. To prove (2), we note that since the derivation requires no steps,
$\sigma \subseteq \tau \in C$. Since $S$ respects $C$, $S\sigma$ matches $S\tau$ trivially. The proof of (1) and (2) for the
case where $C \vdash \tau \subseteq \sigma$ is analogous.

If the final step of the derivation of $\sigma \subseteq \tau$ is (trans), then $C \vdash \sigma \subseteq \tau$ must have followed
from the antecedents $\sigma \subseteq \gamma$ and $\gamma \subseteq \tau$ for some $\gamma$. To prove (1), we note that since the proof
of $\sigma \subseteq \gamma$ is shorter than that of $\sigma \subseteq \tau$, we can assume inductively that $S$ is defined on $\gamma$.
Then, since the proof of $\gamma \subseteq \tau$ is also shorter than that of $\sigma \subseteq \tau$, we can assume that $S$
is defined on $\tau$. To prove (2), we note that since the proof of $\sigma \subseteq \gamma$ is shorter than that
of $\sigma \subseteq \tau$, we can assume inductively by (1) that $S$ is defined on $\gamma$. We can then assume
inductively by (2) that $S\sigma$ matches $S\gamma$, and that $S\gamma$ matches $S\tau$. Therefore, by transitivity,
$S\sigma$ matches $S\tau$. The proof of (1) and (2) for the case where $C \vdash \tau \subseteq \sigma$ is analogous.

If the final step of the derivation of $\sigma \subseteq \tau$ is (arrow), then $\sigma$ must be of the form
$\sigma = \sigma_1 \to \sigma_2$ and $\tau$ must be of the form $\tau = \tau_1 \to \tau_2$, and $C \vdash \sigma \subseteq \tau$ must have followed from
the antecedents $\tau_1 \subseteq \sigma_1$ and $\sigma_2 \subseteq \tau_2$. Since, by assumption, $S\sigma$ is defined, so are $S\sigma_1$ and
$S\sigma_2$. Since the proofs of $\tau_1 \subseteq \sigma_1$ and $\sigma_2 \subseteq \tau_2$ are shorter than that of $\sigma \subseteq \tau$, we can assume
that $S\tau_1$ and $S\tau_2$ are defined and that $S\tau_1$ matches $S\sigma_1$ and $S\sigma_2$ matches $S\tau_2$. Thus, $S\tau$
is defined, and $S\sigma$ matches $S\tau$, proving (1) and (2). The proof of (1) and (2) for the case
where $C \vdash \tau \subseteq \sigma$ is analogous.

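If the final step of the derivation of $\sigma \subseteq \tau$ is (record), then $\sigma$ must be of the form
$\sigma = (h, \mathcal{Z})$ and $\tau$ must be of the form $\tau = (h', \mathcal{Z}')$, where $\mathrm{dom}(h) = \mathrm{dom}(h') \neq \emptyset$. Also,
$C \vdash \sigma \subseteq \tau$ must have followed by the antecedents $(\emptyset, \mathcal{Z}) \subseteq (\emptyset, \mathcal{Z}')$ and
$\forall l \in \mathrm{dom}(h).\ h(l) \subseteq h'(l)$.

To prove (1), it suffices to show that $\forall l \in \mathrm{dom}(h').\ S(h'(l))$ is defined and that
$\mathrm{dom}(h') \cap \mathrm{dom}(\mathrm{left}(S(\emptyset, \mathcal{Z}'))) = \emptyset$. Since, by assumption, $S\sigma$ is defined, then


$\mathrm{dom}(h) \cap \mathrm{dom}(\mathrm{left}(S(\emptyset, \mathcal{Z}))) = \emptyset$,


and $\forall l \in \mathrm{dom}(h).\ S(h(l))$ is defined. Since, for all $l \in \mathrm{dom}(h)$, the proofs of $h(l) \subseteq h'(l)$ are
shorter than that of $\sigma \subseteq \tau$, we can assume inductively that $\forall l \in \mathrm{dom}(h').\ S(h'(l))$ is defined.
Since the proof of $(\emptyset, \mathcal{Z}) \subseteq (\emptyset, \mathcal{Z}')$ is shorter than that of $\sigma \subseteq \tau$, we can assume inductively
by (2) that $S(\emptyset, \mathcal{Z})$ matches $S(\emptyset, \mathcal{Z}')$. Thus,


$\mathrm{dom}(\mathrm{left}(S(\emptyset, \mathcal{Z}))) = \mathrm{dom}(\mathrm{left}(S(\emptyset, \mathcal{Z}')))$,


which proves (1).

To prove (2), we note that we can assume inductively that for all $l \in \mathrm{dom}(h)$, $S(h(l))$
matches $S(h'(l))$, and that $S(\emptyset, \mathcal{Z})$ matches $S(\emptyset, \mathcal{Z}')$. Therefore, if we let $h_1 = \mathrm{left}(S(\emptyset, \mathcal{Z}))$
and $h_2 = \mathrm{left}(S(\emptyset, \mathcal{Z}'))$, then by the definition of match, $\mathrm{dom}(h_1) = \mathrm{dom}(h_2)$, and
$\forall l \in \mathrm{dom}(h_1).\ h_1(l)$ matches $h_2(l)$. Careful inspection of the definition of match shows that
this implies that $S\sigma$ matches $S\tau$, as desired. The proof of (1) and (2) for the case where
$C \vdash \tau \subseteq \sigma$ is analogous, which proves the lemma. $\Box$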

In most other typing rules, every type expression appearing in the antecedent occurs 
(possibly as a subexpression) somewhere in the consequent. The only exception is in the 
application rule app, where the so-called cut formula ($\sigma$, as the rule is written) is eliminated.
(This will come as no surprise to proof theorists.) Thus, in order to determine which 
substitutions preserve provability of typing statements, we only need to keep track of the 
cut formulas used in proofs. 

Since the set of cut formulas will be used as a restriction on allowable substitutions, we
adopt a notation that combines typing statements with restrictions. We will call a formula
$R \mid C, A \supset M : \sigma$ combining a typing statement with a set $R$ of type expressions a restricted
typing statement. Moreover, we say that


Definition 3.5.8 $C, A \supset M : \sigma$ is $R$-provable if $C, A \supset M : \sigma$ is provable by a derivation
using only types from $R$ as cut formulas, and we write $\vdash R \mid C, A \supset M : \sigma$.


Notice that since there are no function applications (and therefore no cut formulas) 
in patterns, there is no need to compute any restriction on substitutions for patterns — 
every well-formed substitution instance of a provable pattern typing is provable. However, 
due to the algorithmic advantages that arise from the similarities between patterns and 
expressions, we use restricted typing statements for patterns as well, noting here that R 
will always be empty. 

Using restrictions, we may now define a useful instance relation on restricted typing
statements.

Definition 3.5.9 A restricted typing $R' \mid C', A' \supset M : \sigma'$ is an instance of $R \mid C, A \supset M : \sigma$
by substitution $S$ if $S$ respects $C$, is defined on $R$, $A$, and $\sigma$, and


$R' \supseteq SR$, $\quad C' \vdash S \bullet C$, $\quad A' \supseteq SA$, $\quad$ and $\quad \sigma' = S\sigma$.


Again, we often simply say one restricted typing statement is an instance of another, 
without mentioning the substitution involved. 

While we have defined “provability” of a restricted typing statement $R \mid C, A \supset M : \sigma$
using the typing system for statements without restrictions, the typing system of Section 
3.4 can be reformulated to prove restricted statements directly. The changes are relatively 
minor. Essentially, restrictions are added to each statement in each rule, with the cut type 
(formula) added to the restriction set in the consequent of rule app. It is easy to show that 
a restricted statement is provable in this augmented system iff it is R-provable. Through 
the rest of the paper, we refer to this augmented system, which is summarized in Table 3.1, 
as the R-typing system. 


3.6 Properties of the R-Typing System 


This section is devoted to proving Theorem 3.6.7, the main theorem of this chapter, which
shows that, in the R-typing system, all instances of a provable typing are provable. In order to
prove this theorem, we will need to use a number of technical properties of the R-typing
system. Thus, before presenting the proof of Theorem 3.6.7, we first discuss these technical
properties in some detail.

First, we want to show that, if a restricted typing $R \mid C, A \supset M : \sigma$ is provable, then if we
add “extra” type restrictions to $R$, “extra” subtype assertions to $C$, or “extra” variable-to-type
or extension variable-to-record type associations to $A$, the resulting restricted typing is
also provable. As a consequence of these properties, we can prove Theorem 3.6.7 by simply
showing that $\vdash R \mid C, A \supset M : \sigma$ implies that $\vdash SR \mid S \bullet C, SA \supset M : S\sigma$ for any substitution
$S$ that respects $C$ and is defined on $R$, $A$, and $\sigma$.

The following three lemmas formally state the properties mentioned above. 


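Lemma 3.6.1 If $\vdash R \mid C, A \supset M : \sigma$ and $R' \supseteq R$, then $\vdash R' \mid C, A \supset M : \sigma$.


Proof. We prove this lemma by induction on the length of the derivation of
$\vdash R \mid C, A \supset M : \sigma$.

If the derivation requires no steps, then $\vdash R \mid C, A \supset M : \sigma$ follows from either (int),
(real), (bool), (string), (var), (rec1), (rec2), or (const). The lemma holds trivially since,
for any $R'$, $\vdash R' \mid C, A \supset M : \sigma$.

If the final step of the derivation is (coerce), then $\vdash R \mid C, A \supset M : \sigma$ must have followed
by the antecedents $\vdash R \mid C, A \supset M : \gamma$ and $C \vdash \gamma \subseteq \sigma$ for some $\gamma$. Since the proof of the
former is shorter, we can assume inductively that $\vdash R' \mid C, A \supset M : \gamma$. By (coerce), we have that
$\vdash R' \mid C, A \supset M : \sigma$.

If the final step of the derivation is (app), then $M$ is of the form $M = M'N'$. Thus,
for some $\gamma$, $R = R_1 \cup \{\gamma\}$, and $\vdash R \mid C, A \supset M : \sigma$ must have followed by the antecedents
$\vdash R_1 \mid C, A \supset M' : \gamma \to \sigma$ and $\vdash R_1 \mid C, A \supset N' : \gamma$. Since the proofs of these are shorter, and
since $R' \supseteq R_1$, we can assume that $\vdash R' \mid C, A \supset M' : \gamma \to \sigma$ and $\vdash R' \mid C, A \supset N' : \gamma$. Since
$\gamma \in R'$, we can infer that $\vdash R' \mid C, A \supset M : \sigma$ by (app).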

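If the final step of the derivation is (abs), then $M$ is of the form $M = \mathrm{fn}\ P \Rightarrow M'$ and $\sigma$ is
of the form $\sigma = \sigma_1 \to \sigma_2$. Moreover, $\vdash R \mid C, A \supset M : \sigma$ must have followed by the antecedents
$\vdash R \mid C^{op}, A' \supset P : \sigma_1$ and $\vdash R \mid C, A[A'] \supset M' : \sigma_2$ for some $A'$ such that $\mathrm{vars}(A') = \mathrm{vars}(P)$.
Since the proofs of these are shorter, we can assume that $\vdash R' \mid C^{op}, A' \supset P : \sigma_1$ and
$\vdash R' \mid C, A[A'] \supset M' : \sigma_2$. By (abs), we have that $\vdash R' \mid C, A \supset M : \sigma$.

If the final step of the derivation is (rec3), then $M$ is of the form $M = (f, E)$ and $\sigma$
is of the form $\sigma = (h_1 + h_2, \mathcal{Z})$. Moreover, $\vdash R \mid C, A \supset M : \sigma$ must have followed by
the antecedents $\mathrm{dom}(f) = \mathrm{dom}(h_1) \neq \emptyset$, $\vdash R \mid C, A \supset (\emptyset, E) : (h_2, \mathcal{Z})$, and
$\forall l \in \mathrm{dom}(f).\ \vdash R \mid C, A \supset f(l) : h_1(l)$. Since the proofs of these are shorter, we can assume that
$\vdash R' \mid C, A \supset (\emptyset, E) : (h_2, \mathcal{Z})$, and that $\forall l \in \mathrm{dom}(f).\ \vdash R' \mid C, A \supset f(l) : h_1(l)$. By (rec3), we can infer
that $\vdash R' \mid C, A \supset M : \sigma$, which proves the lemma. $\Box$


(int)     $R \mid C, A \supset b : \mathrm{int}$ whenever $b$ is an integer

(real)    $R \mid C, A \supset b : \mathrm{real}$ whenever $b$ is a real

(bool)    $R \mid C, A \supset b : \mathrm{bool}$ whenever $b$ is a boolean

(string)  $R \mid C, A \supset b : \mathrm{string}$ whenever $b$ is a string

(const)   $R \mid C, A \supset q : S\tau_q$ whenever $S$ is defined on $\tau_q$, where $\tau_q$ is the built-in type for $q$

(var)     $R \mid C, A \supset x : \sigma$ whenever $x : \sigma \in A$

(app)     $\dfrac{R \mid C, A \supset M : \sigma \to \tau \qquad R \mid C, A \supset N : \sigma}{R \cup \{\sigma\} \mid C, A \supset MN : \tau}$

(abs)     $\dfrac{R \mid C^{op}, A' \supset P : \sigma \qquad R \mid C, A[A'] \supset M : \tau}{R \mid C, A \supset \mathrm{fn}\ P \Rightarrow M : \sigma \to \tau}$ where $\mathrm{vars}(A') = \mathrm{vars}(P)$

(coerce)  $\dfrac{R \mid C, A \supset M : \sigma \qquad C \vdash \sigma \subseteq \tau}{R \mid C, A \supset M : \tau}$

(rec1)    $R \mid C, A \supset (\emptyset, \mathrm{EMPTY}) : (\emptyset, \mathrm{NULL})$

(rec2)    $R \mid C, A \supset (\emptyset, u) : (h, \mathcal{Z})$ whenever $u : (h, \mathcal{Z}) \in A$

(rec3)    $\dfrac{R \mid C, A \supset (\emptyset, E) : (h_2, \mathcal{Z}) \qquad \forall l \in \mathrm{dom}(f).\ R \mid C, A \supset f(l) : h_1(l)}{R \mid C, A \supset (f, E) : (h_1 + h_2, \mathcal{Z})}$ where $\mathrm{dom}(f) = \mathrm{dom}(h_1) \neq \emptyset$ (all typings well-formed)


Table 3.1: The R-Typing System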


Lemma 3.6.2 If $\vdash R \mid C_1, A \supset M : \sigma$ and $C_2 \vdash C_1$, where $C_1$ is an atomic set, then
$\vdash R \mid C_2, A \supset M : \sigma$.


Proof. We prove this lemma by induction on the length of the derivation of
$\vdash R \mid C_1, A \supset M : \sigma$.

If the derivation requires no steps, then $\vdash R \mid C_1, A \supset M : \sigma$ follows from either (int),
(real), (bool), (string), (var), (rec1), (rec2), or (const). The lemma holds trivially since,
for any $C_2$, $\vdash R \mid C_2, A \supset M : \sigma$.

If the final step of the derivation is (coerce), then $\vdash R \mid C_1, A \supset M : \sigma$ must have followed
by the antecedents $\vdash R \mid C_1, A \supset M : \gamma$ and $C_1 \vdash \gamma \subseteq \sigma$ for some $\gamma$. Since the proof of
$\vdash R \mid C_1, A \supset M : \gamma$ is shorter, we can assume inductively that $\vdash R \mid C_2, A \supset M : \gamma$. Since
$C_2 \vdash C_1$, we know that $C_2 \vdash \gamma \subseteq \sigma$. By (coerce), we have that $\vdash R \mid C_2, A \supset M : \sigma$.

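If the final step of the derivation is (app), then $M$ is of the form $M = M'N'$, and
$\vdash R \mid C_1, A \supset M : \sigma$ must have followed by the antecedents $\vdash R \mid C_1, A \supset M' : \gamma \to \sigma$ and
$\vdash R \mid C_1, A \supset N' : \gamma$ for some $\gamma$. Since the proofs of these restricted typing statements are
shorter, we can assume that $\vdash R \mid C_2, A \supset M' : \gamma \to \sigma$ and $\vdash R \mid C_2, A \supset N' : \gamma$. By (app), we
have that $\vdash R \mid C_2, A \supset M : \sigma$.

If the final step of the derivation is (abs), then $M$ is of the form $M = \mathrm{fn}\ P \Rightarrow M'$
and $\sigma$ is of the form $\sigma = \sigma_1 \to \sigma_2$. Moreover, $\vdash R \mid C_1, A \supset M : \sigma$ must have followed by
the antecedents $\vdash R \mid C_1^{op}, A' \supset P : \sigma_1$ and $\vdash R \mid C_1, A[A'] \supset M' : \sigma_2$ for some $A'$ such that
$\mathrm{vars}(A') = \mathrm{vars}(P)$. Since the proofs of these restricted typing statements are shorter, we
can assume that $\vdash R \mid C_2^{op}, A' \supset P : \sigma_1$ and $\vdash R \mid C_2, A[A'] \supset M' : \sigma_2$. By (abs), we can infer
that $\vdash R \mid C_2, A \supset M : \sigma$.

If the final step of the derivation is (rec3), then $M$ is of the form $M = (f, E)$ and $\sigma$ is
of the form $\sigma = (h_1 + h_2, \mathcal{Z})$. Moreover, $\vdash R \mid C_1, A \supset M : \sigma$ must have followed by the
antecedents $\mathrm{dom}(f) = \mathrm{dom}(h_1) \neq \emptyset$, $\vdash R \mid C_1, A \supset (\emptyset, E) : (h_2, \mathcal{Z})$, and
$\forall l \in \mathrm{dom}(f).\ \vdash R \mid C_1, A \supset f(l) : h_1(l)$. Since the proofs of these restricted typing statements are shorter, we
can assume that $\vdash R \mid C_2, A \supset (\emptyset, E) : (h_2, \mathcal{Z})$, and that


$\forall l \in \mathrm{dom}(f).\ \vdash R \mid C_2, A \supset f(l) : h_1(l)$.


Thus, by (rec3), $\vdash R \mid C_2, A \supset M : \sigma$. $\Box$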

We prove a slightly more complicated lemma for type environments. In our proof of
Theorem 3.6.7, we only need to use the second property outlined below; however, in the
proofs of our main theorems in Chapter 4, we will need to use all three of these properties.


Lemma 3.6.3 Suppose that $\vdash R \mid C, A \supset M : \sigma$.


1. If $w$ occurs free in $M$, then $w : \tau \in A$ for some $\tau$.
2. If $w : \tau \in B$ for every $w : \tau \in A$ with $w$ free in $M$, then $\vdash R \mid C, B \supset M : \sigma$.
3. There is a proof of $\vdash R \mid C, B \supset M : \sigma$ that is of the same length as that of
$\vdash R \mid C, A \supset M : \sigma$.


Proof. We note that proving part (1) is equivalent to proving that if $\vdash R \mid C, A \supset M : \sigma$,
then $\mathrm{vars}(M) \subseteq \mathrm{vars}(A)$. For part (2), it is sufficient to prove that if $\vdash R \mid C, A \supset M : \sigma$
and if $A_M \subseteq B$, where $A_M = \{w : \sigma \mid w : \sigma \in A \text{ and } w \in \mathrm{vars}(M)\}$, then $\vdash R \mid C, B \supset M : \sigma$.
We prove the lemma by induction on the length of the derivation of $\vdash R \mid C, A \supset M : \sigma$.

If the derivation requires no steps, then the axiom used is either (int), (real), (bool),
(string), (var), (rec1), (rec2), or (const). Since $\mathrm{vars}(b) = \mathrm{vars}((\emptyset, \mathrm{EMPTY})) = \mathrm{vars}(q) = \emptyset$,
part (1) is trivially true for all the cases other than (var) and (rec2). To prove (1),
we note that if the rule applied is (var) or (rec2), then $\mathrm{vars}(M) \subseteq \mathrm{vars}(A)$ by the axiom
itself. Similarly, part (2) is trivially true if the rule applied is any other than (var) or
(rec2), since $\vdash R \mid C, B \supset M : \sigma$ for any $B$. To prove (2), we note that if the axiom used
is (var), where $M = x$, then $A_M = \{x : \sigma\}$. Since, by assumption, $A_M \subseteq B$, we can
infer that $\vdash R \mid C, B \supset M : \sigma$ by (var). If the axiom used is (rec2), where $M = (\emptyset, u)$,
then $A_M = \{u : \sigma\}$. Since, by assumption, $A_M \subseteq B$, we can infer that $\vdash R \mid C, B \supset M : \sigma$
by (rec2). For part (3), we note that $\vdash R \mid C, B \supset M : \sigma$ holds by the same axiom as
$\vdash R \mid C, A \supset M : \sigma$.

If the final step of the derivation is (coerce), then $\vdash R \mid C, A \supset M : \sigma$ must have followed
by the antecedents $\vdash R \mid C, A \supset M : \gamma$ and $C \vdash \gamma \subseteq \sigma$ for some $\gamma$. Since the proof of
$\vdash R \mid C, A \supset M : \gamma$ is shorter, we can assume inductively that (1) holds true for $M$. To prove
(2), we can assume inductively that $\vdash R \mid C, B \supset M : \gamma$. Since $C \vdash \gamma \subseteq \sigma$, we can derive
by rule (coerce) that $\vdash R \mid C, B \supset M : \sigma$. To prove (3), we can assume inductively that the
proof of $\vdash R \mid C, B \supset M : \gamma$ is of the same length as that of $\vdash R \mid C, A \supset M : \gamma$. By (coerce),
we can infer that (3) holds for $\vdash R \mid C, B \supset M : \sigma$.

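If the final step of the derivation is (app), then $M$ is of the form $M = M'N'$, and
$\vdash R \mid C, A \supset M : \sigma$ must have followed by the antecedents $\vdash R \mid C, A \supset M' : \gamma \to \sigma$ and
$\vdash R \mid C, A \supset N' : \gamma$ for some $\gamma$. Since the derivations of these are shorter, we can assume that


$\mathrm{vars}(M') \subseteq \mathrm{vars}(A)$ and $\mathrm{vars}(N') \subseteq \mathrm{vars}(A)$.


Thus, $\mathrm{vars}(M) \subseteq \mathrm{vars}(A)$. To prove (2), we note that $A_M \subseteq B$ implies that $A_{M'} \subseteq B$ and
$A_{N'} \subseteq B$. We can thus assume inductively that $\vdash R \mid C, B \supset M' : \gamma \to \sigma$ and
$\vdash R \mid C, B \supset N' : \gamma$. By rule (app), we can infer that $\vdash R \mid C, B \supset M : \sigma$. To prove (3), we can assume
inductively that (3) holds for $\vdash R \mid C, B \supset M' : \gamma \to \sigma$ and for $\vdash R \mid C, B \supset N' : \gamma$. By rule
(app), we have that (3) holds for $\vdash R \mid C, B \supset M : \sigma$.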

If the final step of the derivation is (abs), then $M$ is of the form $M = \mathrm{fn}\ P \Rightarrow M'$
and $\sigma$ is of the form $\sigma = \sigma_1 \to \sigma_2$. Moreover, $\vdash R \mid C, A \supset M : \sigma$ must have followed by
the antecedents $\vdash R \mid C^{op}, A' \supset P : \sigma_1$ and $\vdash R \mid C, A[A'] \supset M' : \sigma_2$ for some $A'$ such that
$\mathrm{vars}(A') = \mathrm{vars}(P)$. Thus, we can assume inductively that $\mathrm{vars}(M') \subseteq \mathrm{vars}(A[A'])$. We
note that by definition, $A[A'] = A_1 \cup A'$,


where $A = A_1 \cup A_2$, $A_2 = \{w : \sigma \mid w : \sigma \in A \text{ and } w \in \mathrm{vars}(A')\}$,
which implies that $A[A'] - A' = A_1$. Therefore,


$\mathrm{vars}(M') - \mathrm{vars}(P) \subseteq \mathrm{vars}(A_1) \subseteq \mathrm{vars}(A)$,
as desired. To prove (2), we note that $A_M \subseteq B$ implies that $(A[A'])_{M'} \subseteq B[A']$, and
thus we can assume inductively that $\vdash R \mid C, B[A'] \supset M' : \sigma_2$. By (abs), we can infer
that $\vdash R \mid C, B \supset M : \sigma$. To prove (3), we can assume inductively that (3) holds for
$\vdash R \mid C, B[A'] \supset M' : \sigma_2$. By (abs), we have that (3) holds for $\vdash R \mid C, B \supset M : \sigma$.

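If the final step of the derivation is (rec3), then $M$ is of the form $M = (f, E)$ and $\sigma$ is
of the form $\sigma = (h_1 + h_2, \mathcal{Z})$. Moreover, $\vdash R \mid C, A \supset M : \sigma$ must have followed by the
antecedents $\mathrm{dom}(f) = \mathrm{dom}(h_1) \neq \emptyset$, $\vdash R \mid C, A \supset (\emptyset, E) : (h_2, \mathcal{Z})$, and
$\forall l \in \mathrm{dom}(f).\ \vdash R \mid C, A \supset f(l) : h_1(l)$. We can assume inductively that


$\mathrm{vars}((\emptyset, E)) \subseteq \mathrm{vars}(A)$ and that $\forall l \in \mathrm{dom}(f).\ \mathrm{vars}(f(l)) \subseteq \mathrm{vars}(A)$.


Thus, $\mathrm{vars}(M) \subseteq \mathrm{vars}(A)$, proving (1). To prove (2), we note that $A_M \subseteq B$ implies
that $A_{(\emptyset, E)} \subseteq B$ and that $\forall l \in \mathrm{dom}(f).\ A_{f(l)} \subseteq B$. We can thus assume inductively that
$\vdash R \mid C, B \supset (\emptyset, E) : (h_2, \mathcal{Z})$, and that


$\forall l \in \mathrm{dom}(f).\ \vdash R \mid C, B \supset f(l) : h_1(l)$.


By rule (rec3), we can infer that $\vdash R \mid C, B \supset M : \sigma$. To prove (3), we can assume inductively
that (3) holds for $\vdash R \mid C, B \supset (\emptyset, E) : (h_2, \mathcal{Z})$ and for $\forall l \in \mathrm{dom}(f).\ \vdash R \mid C, B \supset f(l) : h_1(l)$.
By (rec3), we have that (3) holds for $\vdash R \mid C, B \supset M : \sigma$, proving the lemma. $\Box$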

The next three lemmas present some properties about substitutions that we will need 
in the proof of Theorem 3.6.7 for technical reasons. 


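Lemma 3.6.4 Suppose that $\sigma$, $\gamma$, and $\tau$ match. If $C$ is an atomic set, and $C \vdash \mathrm{ATOMIC}(\sigma \subseteq \gamma)$
and $C \vdash \mathrm{ATOMIC}(\gamma \subseteq \tau)$, then $C \vdash \mathrm{ATOMIC}(\sigma \subseteq \tau)$.


Proof. The proof proceeds by induction on the structure of $\sigma$, $\gamma$, and $\tau$. If $\sigma \subseteq \gamma$ and
$\gamma \subseteq \tau$ are atomic, then $\mathrm{ATOMIC}(\sigma \subseteq \gamma) = \{\sigma \subseteq \gamma\}$, and $\mathrm{ATOMIC}(\gamma \subseteq \tau) = \{\gamma \subseteq \tau\}$.
Thus, $C \vdash \{\sigma \subseteq \tau\}$, as desired.

For the remaining cases, the proof follows by simple induction. $\Box$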

Using the above lemma, we show the following property. 


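Lemma 3.6.5 Suppose that a substitution $S$ respects an atomic set $C$ and is defined on $\sigma$
and $\tau$. If $C \vdash \sigma \subseteq \tau$, then $S \bullet C \vdash \mathrm{ATOMIC}(S\sigma \subseteq S\tau)$.


Proof. We prove the lemma by induction on the length of the derivation of $\sigma \subseteq \tau$
from $C$.
If the derivation of $\sigma \subseteq \tau$ from $C$ requires no steps, then $\sigma \subseteq \tau \in C$. Thus, by definition,


$\mathrm{ATOMIC}(S\sigma \subseteq S\tau) \subseteq S \bullet C$.


If the final step of the derivation of $\sigma \subseteq \tau$ is (trans), then $C \vdash \sigma \subseteq \tau$ must have followed
by the antecedents $\sigma \subseteq \gamma$ and $\gamma \subseteq \tau$ for some $\gamma$. By Lemma 3.5.7, we have that $S$ is defined
on $\gamma$. Since the proofs of $\sigma \subseteq \gamma$ and $\gamma \subseteq \tau$ are shorter than that of $\sigma \subseteq \tau$, we can assume
that

$S \bullet C \vdash \mathrm{ATOMIC}(S\sigma \subseteq S\gamma)$ and that $S \bullet C \vdash \mathrm{ATOMIC}(S\gamma \subseteq S\tau)$.


By Lemma 3.6.4, we can infer that $S \bullet C \vdash \mathrm{ATOMIC}(S\sigma \subseteq S\tau)$.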

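If the final step of the derivation of $\sigma \subseteq \tau$ is (arrow), then $\sigma$ must be of the form
$\sigma = \sigma_1 \to \sigma_2$ and $\tau$ must be of the form $\tau = \tau_1 \to \tau_2$, and $C \vdash \sigma \subseteq \tau$ must have followed by
the antecedents $\tau_1 \subseteq \sigma_1$ and $\sigma_2 \subseteq \tau_2$. Since, by assumption, $S\sigma$ and $S\tau$ are defined, so are
$S\sigma_1$, $S\sigma_2$, $S\tau_1$, and $S\tau_2$. Since the proofs of $\tau_1 \subseteq \sigma_1$ and $\sigma_2 \subseteq \tau_2$ are shorter than that of
$\sigma \subseteq \tau$, we can assume that


$S \bullet C \vdash \mathrm{ATOMIC}(S\tau_1 \subseteq S\sigma_1)$ and $S \bullet C \vdash \mathrm{ATOMIC}(S\sigma_2 \subseteq S\tau_2)$.


Thus, $S \bullet C \vdash \mathrm{ATOMIC}(S\sigma \subseteq S\tau)$.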

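If the final step of the derivation of $\sigma \subseteq \tau$ is (record), then $\sigma$ must be of the form
$\sigma = (h, \mathcal{Z})$ and $\tau$ must be of the form $\tau = (h', \mathcal{Z}')$, where $\mathrm{dom}(h) = \mathrm{dom}(h') \neq \emptyset$. Since the
final step of the derivation is (record), we can assume that $C \vdash \sigma \subseteq \tau$ must have followed
by the antecedents $(\emptyset, \mathcal{Z}) \subseteq (\emptyset, \mathcal{Z}')$ and $\forall l \in \mathrm{dom}(h).\ h(l) \subseteq h'(l)$. Since the proofs of these
inclusions are shorter, we can assume inductively that


$\forall l \in \mathrm{dom}(h).\ S \bullet C \vdash \mathrm{ATOMIC}(S(h(l)) \subseteq S(h'(l)))$


and


$S \bullet C \vdash \mathrm{ATOMIC}(S(\emptyset, \mathcal{Z}) \subseteq S(\emptyset, \mathcal{Z}'))$.


Let $(h_s, \mathcal{Z}_s) = S(\emptyset, \mathcal{Z})$, and let $(h_s', \mathcal{Z}_s') = S(\emptyset, \mathcal{Z}')$. Since $S\sigma$ and $S\tau$ match, we can infer
that $\mathrm{dom}(h_s) = \mathrm{dom}(h_s')$. Then,


$\forall l \in \mathrm{dom}(h_s).\ S \bullet C \vdash \mathrm{ATOMIC}(h_s(l) \subseteq h_s'(l))$,


and $S \bullet C \vdash \mathrm{ATOMIC}((\emptyset, \mathcal{Z}_s) \subseteq (\emptyset, \mathcal{Z}_s'))$. Thus, $S \bullet C \vdash \mathrm{ATOMIC}(S\sigma \subseteq S\tau)$. $\Box$

The third property is as follows.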


Lemma 3.6.6 Suppose that $\vdash R \mid C, A \supset P : \sigma$ where $\mathrm{vars}(A) = \mathrm{vars}(P)$. If $S$ is a
substitution that respects $C$ and is defined on $\sigma$, then $S$ is defined on $A$.


Proof. The proof is by induction on the length of the derivation of $\vdash R \mid C, A \supset P : \sigma$.
We write $A_P$ for the type environment $A$ restricted to the free variables of $P$; more precisely,
$A_P = \{w : \sigma \mid w : \sigma \in A \text{ and } w \in \mathrm{vars}(P)\}$.

If the derivation requires no steps, then $\vdash R \mid C, A \supset P : \sigma$ follows from either (int),
(real), (bool), (string), (var), (rec1), or (rec2). For all of the cases except (var) and (rec2),
the lemma holds trivially since $A = \emptyset$. If the axiom used is (var), where $P$ is of the form
$P = x$, then $A = \{x : \sigma\}$, and so, by assumption, $S$ is defined on $A$. If the axiom used is
(rec2), where $P$ is of the form $P = (\emptyset, u)$, then $A = \{u : \sigma\}$, and again by assumption, $S$ is
defined on $A$.

If the final step of the derivation is (coerce), then $\vdash R \mid C, A \supset P : \sigma$ must have followed
by the antecedents $\vdash R \mid C, A \supset P : \gamma$ and $C \vdash \gamma \subseteq \sigma$ for some $\gamma$. By Lemma 3.5.7, $S$ is
defined on $\gamma$. Since the proof of $\vdash R \mid C, A \supset P : \gamma$ is shorter than that of $\vdash R \mid C, A \supset P : \sigma$,
we can assume inductively that $S$ is defined on $A$.

If the final step of the derivation is (rec3), then $P$ is of the form $P = (f, E)$ and $\sigma$ is of the
form $\sigma = (h_1 + h_2, \mathcal{Z})$. Moreover, $\vdash R \mid C, A \supset P : \sigma$ must have followed by the antecedents
$\mathrm{dom}(f) = \mathrm{dom}(h_1) \neq \emptyset$, $\vdash R \mid C, A \supset (\emptyset, E) : (h_2, \mathcal{Z})$, and
$\forall l \in \mathrm{dom}(f).\ \vdash R \mid C, A \supset f(l) : h_1(l)$. By Lemma 3.6.3, we know that


$\vdash R \mid C, A_{(\emptyset, E)} \supset (\emptyset, E) : (h_2, \mathcal{Z})$,
and that
$\forall l \in \mathrm{dom}(f).\ \vdash R \mid C, A_{f(l)} \supset f(l) : h_1(l)$.
Moreover, by Lemma 3.6.3, we can infer that the proofs of these are shorter than that of
$\vdash R \mid C, A \supset P : \sigma$.

We can thus assume inductively that $\forall l \in \mathrm{dom}(f).\ S$ is defined on $A_{f(l)}$ and that $S$ is
defined on $A_{(\emptyset, E)}$. Since


$A = \bigcup_{l \in \mathrm{dom}(f)} A_{f(l)} \cup A_{(\emptyset, E)}$,


$S$ is defined on $A$, which proves the lemma. $\Box$

Having developed all the necessary technical machinery, we now prove the main theorem 
of the chapter. This theorem shows that the R-typing system has the important property 
that the original typing system lacked: namely, that all instances of a provable typing are 
provable. In Chapter 4, we will make use of this property to show that the algorithm infers 
most general typings. 


Theorem 3.6.7 Suppose that $R' \mid C', A' \supset M : \sigma'$ is an instance of $R \mid C, A \supset M : \sigma$. If
$\vdash R \mid C, A \supset M : \sigma$, then $\vdash R' \mid C', A' \supset M : \sigma'$.


Proof. We prove the theorem by induction on the length of the derivation of
$\vdash R \mid C, A \supset M : \sigma$.

We note that if $R' \mid C', A' \supset M : \sigma'$ is an instance of $R \mid C, A \supset M : \sigma$, then there exists a
substitution $S$ such that $S$ respects $C$ and is defined on $R$, $A$, and $\sigma$, and such that


$R' \supseteq SR$, $\quad C' \vdash S \bullet C$, $\quad A' \supseteq SA$, $\quad$ and $\quad \sigma' = S\sigma$.


If the derivation requires no steps, then $\vdash R \mid C, A \supset M : \sigma$ must have followed by a typing
axiom. For the purposes of our proof, it suffices to show that $\vdash SR \mid S \bullet C, SA \supset M : S\sigma$, since
we can then conclude by Lemma 3.6.3, Lemma 3.6.2, and Lemma 3.6.1 that
$\vdash R' \mid C', A' \supset M : \sigma'$. If the typing axiom used is (int), (real), (bool), (string), or (rec1), then, trivially,
$\vdash SR \mid S \bullet C, SA \supset M : S\sigma$ by that same axiom. If the axiom used is (var) or (rec2),
$\vdash SR \mid S \bullet C, SA \supset M : S\sigma$ follows by that same axiom, since $S\sigma \in SA$.

Otherwise, the axiom used was (const). Thus, for some substitution $T$ that is defined
on $\tau_q$, we have that $\sigma = T\tau_q$, and so, $S\sigma = S(T\tau_q)$. Since, by assumption, $S\sigma$ is defined, we
can assume that $S(T\tau_q)$ is defined as well. We would now like to show that $S\sigma = (T;S)\tau_q$;
however, since $T$ may map type or row variables other than $\tau_q$ to types, $S;T$ may not
be defined. We thus consider instead the substitution $T'$, where $\mathrm{dom}(T') = \{\tau_q\}$, and
$T'[\tau_q] = T[\tau_q]$, and note that since $S\sigma$ is defined, so is $S;T'$. Thus, we can infer that
$\vdash SR \mid S \bullet C, SA \supset M : (S;T')\tau_q$ by (const).

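If the final step of the derivation is (coerce), then $\vdash R \mid C, A \supset M : \sigma$ must have followed
by the antecedents $\vdash R \mid C, A \supset M : \gamma$ and $C \vdash \gamma \subseteq \sigma$ for some $\gamma$. We note that by Lemma
3.5.7, $S$ is defined on $\gamma$. Moreover, by Lemma 3.6.5, we have that $S \bullet C \vdash \mathrm{ATOMIC}(S\gamma \subseteq S\sigma)$;
thus, by transitivity,

$C' \vdash S\gamma \subseteq S\sigma$.


Since $R' \mid C', A' \supset M : S\gamma$ is an instance of $R \mid C, A \supset M : \gamma$, and the proof of
$\vdash R \mid C, A \supset M : \gamma$ is shorter than that of $\vdash R \mid C, A \supset M : \sigma$, we can assume that $\vdash R' \mid C', A' \supset M : S\gamma$.
By (coerce), we can infer that $\vdash R' \mid C', A' \supset M : \sigma'$.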

If the final step of the derivation is (app), then $M$ is of the form $M = M'N'$, and
$\vdash R \mid C, A \supset M : \sigma$ must have followed by the antecedents $\vdash R_1 \mid C, A \supset M' : \gamma \to \sigma$ and
$\vdash R_1 \mid C, A \supset N' : \gamma$ for some $\gamma$ and $R_1$ such that $R = R_1 \cup \{\gamma\}$. Since by assumption $S$ is
defined on $R$, we can assume that


$S$ is defined on $\gamma$.


It is worthwhile to note here that this is precisely the reason that restriction sets are
needed in our typing system. Without restriction sets, it is impossible to guarantee that
$S$ is defined on $\gamma$.

Proceeding with the proof, we can now assume that $R' \mid C', A' \supset M' : S\gamma \to S\sigma$ is an
instance of $R_1 \mid C, A \supset M' : \gamma \to \sigma$ and $R' \mid C', A' \supset N' : S\gamma$ is an instance of $R_1 \mid C, A \supset N' : \gamma$.
Since the proofs of $\vdash R_1 \mid C, A \supset M' : \gamma \to \sigma$ and $\vdash R_1 \mid C, A \supset N' : \gamma$ are shorter than that
of $\vdash R \mid C, A \supset M : \sigma$, we can assume inductively that $\vdash R' \mid C', A' \supset M' : S\gamma \to S\sigma$ and
$\vdash R' \mid C', A' \supset N' : S\gamma$. Thus, since $S\gamma \in R'$, we can infer by (app) that $\vdash R' \mid C', A' \supset M : \sigma'$.

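If the final step of the derivation is (abs), then $M$ is of the form $M = \mathrm{fn}\ P \Rightarrow M'$
and $\sigma$ is of the form $\sigma = \sigma_1 \to \sigma_2$. Moreover, $\vdash R \mid C, A \supset M : \sigma$ must have followed by
the antecedents $\vdash R \mid C^{op}, A_P \supset P : \sigma_1$ and $\vdash R \mid C, A[A_P] \supset M' : \sigma_2$ for some $A_P$ such that
$\mathrm{vars}(A_P) = \mathrm{vars}(P)$. Since, by assumption, $S$ is defined on $\sigma$, we can assume that $S$ is
defined on $\sigma_1$ and $\sigma_2$. Thus, by Lemma 3.6.6,


$S$ is defined on $A_P$.
Since $A[A_P] = A_1 \cup A_P$, where $A_1 \subseteq A$,
$S$ is defined on $A[A_P]$.


We can thus assume that $SR \mid S \bullet C^{op}, SA_P \supset P : S\sigma_1$ is an instance of $R \mid C^{op}, A_P \supset P : \sigma_1$,
and that $SR \mid S \bullet C, S(A[A_P]) \supset M' : S\sigma_2$ is an instance of $R \mid C, A[A_P] \supset M' : \sigma_2$. Since
the proofs of $\vdash R \mid C^{op}, A_P \supset P : \sigma_1$ and $\vdash R \mid C, A[A_P] \supset M' : \sigma_2$ are shorter than that
of $\vdash R \mid C, A \supset M : \sigma$, we can assume inductively that $\vdash SR \mid S \bullet C^{op}, SA_P \supset P : S\sigma_1$ and
$\vdash SR \mid S \bullet C, S(A[A_P]) \supset M' : S\sigma_2$. By (abs), we can infer that
$\vdash SR \mid S \bullet C, SA \supset \mathrm{fn}\ P \Rightarrow M' : S\sigma_1 \to S\sigma_2$. Using the results of Lemma 3.6.3, Lemma 3.6.2, and Lemma 3.6.1, we can
conclude that $\vdash R' \mid C', A' \supset M : \sigma'$.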

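If the final step of the derivation is (rec3), then $M$ is of the form $M = (f, E)$ and $\sigma$ is of
the form $\sigma = (h, \mathcal{Z})$, where, for some $h_1$ and $h_2$, $h = h_1 + h_2$ and $\mathrm{dom}(h_1) \cap \mathrm{dom}(h_2) = \emptyset$.
Moreover, $\vdash R \mid C, A \supset M : \sigma$ must have followed by the antecedents $\mathrm{dom}(f) = \mathrm{dom}(h_1) \neq \emptyset$,
$\vdash R \mid C, A \supset (\emptyset, E) : (h_2, \mathcal{Z})$, and $\forall l \in \mathrm{dom}(f).\ \vdash R \mid C, A \supset f(l) : h_1(l)$. Since, by
assumption, $S$ is defined on $\sigma$, we can assume that $\forall l \in \mathrm{dom}(h).\ S(h(l))$ is defined and
that $\mathrm{dom}(h) \cap \mathrm{dom}(\mathrm{left}(S(\emptyset, \mathcal{Z}))) = \emptyset$. Since $h_2 \subseteq h$, we can assume that $S(h_2, \mathcal{Z})$ is
defined as well. Thus, we can assume inductively that $\vdash R' \mid C', A' \supset (\emptyset, E) : S(h_2, \mathcal{Z})$, and
$\forall l \in \mathrm{dom}(f).\ \vdash R' \mid C', A' \supset f(l) : S(h_1(l))$. It remains to show that the antecedents needed
to derive $\vdash R' \mid C', A' \supset (f, E) : S(h, \mathcal{Z})$ by (rec3) are true.

We first note that $S(h_2, \mathcal{Z}) = (h_2;S + \mathrm{left}(S(\emptyset, \mathcal{Z})),\ \mathrm{right}(S(\emptyset, \mathcal{Z})))$. Since $+$ is
associative, we have that

$S(h, \mathcal{Z}) = ((h_1 + h_2);S + \mathrm{left}(S(\emptyset, \mathcal{Z})),\ \mathrm{right}(S(\emptyset, \mathcal{Z})))$

which is equal to

$(h_1;S + (h_2;S + \mathrm{left}(S(\emptyset, \mathcal{Z}))),\ \mathrm{right}(S(\emptyset, \mathcal{Z})))$.

Thus, we can infer by (rec3) that $\vdash R' \mid C', A' \supset (f, E) : \sigma'$, which proves the theorem. $\Box$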
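
The way a substitution acts on a record type in the (rec3) case can be sketched in Python as follows. This is only an illustration under an assumed encoding: records are tuples `("rec", fields, row)`, a substitution maps type variables to types and row variables to a pair standing for (left, right) of $S(\emptyset, \mathcal{Z})$, and the name `apply_subst` is hypothetical.

```python
def apply_subst(subst, ty):
    """Apply a substitution to a type; raise ValueError where it is undefined."""
    if isinstance(ty, str):                      # type variable or ground type
        return subst.get(ty, ty)
    if ty[0] == "->":
        return ("->", apply_subst(subst, ty[1]), apply_subst(subst, ty[2]))
    _, fields, row = ty
    # The image of the row plays the role of (left, right) of S(empty, Z).
    extra, new_row = subst.get(row, ({}, row))
    if set(fields) & set(extra):
        # dom(h) must be disjoint from dom(left(S(empty, Z)))
        raise ValueError("substitution undefined: duplicate labels")
    new_fields = {label: apply_subst(subst, t) for label, t in fields.items()}
    new_fields.update(extra)
    return ("rec", new_fields, new_row)


S = {"t": "int", "Z": ({"c": "bool"}, "NULL")}
h1 = {"a": "t"}
h2 = {"b": "real"}
whole = apply_subst(S, ("rec", {**h1, **h2}, "Z"))      # S(h1 + h2, Z)
tail = apply_subst(S, ("rec", h2, "Z"))                  # S(h2, Z)
head = {label: apply_subst(S, t) for label, t in h1.items()}   # h1;S
```

In this example, extending the fields of `tail` with `head` gives the fields of `whole`, which is the regrouping by associativity that the proof relies on.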


3.7 Most General Typings


In this section, we discuss the notion of most general typings in the context of the R-typing
system. We extend the definition of most general typings from [Mit84] a bit and say that a
restricted typing $R \mid C, A \supset M : \sigma$ is a most general restricted typing for $M$ iff the set of its
instances is exactly the set of provable restricted typings for $M$.

It is worth noting here that most general restricted typings in our system are essentially
unique in the sense that they form a natural equivalence class. This is also true of ML,
where the equivalence relation is that all most general typings for a given expression are
unique up to variable renaming. Interestingly, in ML, two typing statements are unique up
to variable renaming iff they are instances of each other. With the introduction of subtyping
assertions, however, the notions of instance and variable renaming are no longer equivalent,
and it turns out that the correct equivalence relation for this system is that all most general
restricted typings for a given expression are instances of one another.

For example,

$\emptyset \mid \{s \subseteq t\}, \emptyset \supset \mathrm{fn}\ x \Rightarrow x : s \to t$


and

$\emptyset \mid \{a \subseteq b,\ c \subseteq d\}, \emptyset \supset \mathrm{fn}\ x \Rightarrow x : c \to d$


are both most general restricted typings of the identity function, since every typing of
$\mathrm{fn}\ x \Rightarrow x$ is an instance of both of them. These restricted typings are not unique up to
variable renaming, but they are instances of one another.
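
The claim that these two typings are instances of one another can be checked mechanically in a simplified setting. The Python sketch below assumes substitutions that only rename type variables, so that respecting $C$ and definedness are immediate, and it encodes a restricted typing as a tuple `(R, C, A, sigma)`; the names `subst`, `derivable`, and `is_instance` are hypothetical.

```python
def subst(s, ty):
    """Apply a variable-renaming substitution to a variable or arrow type."""
    if isinstance(ty, str):
        return s.get(ty, ty)
    return ("->", subst(s, ty[1]), subst(s, ty[2]))


def derivable(c, lo, hi):
    """lo <= hi follows from the atomic set c by reflexivity and transitivity."""
    seen, todo = {lo}, [lo]
    while todo:
        current = todo.pop()
        for a, b in c:
            if a == current and b not in seen:
                seen.add(b)
                todo.append(b)
    return hi in seen


def is_instance(s, general, specific):
    """Check the four conditions of the instance relation for renaming s."""
    (r, c, a, sigma), (r2, c2, a2, sigma2) = general, specific
    return ({subst(s, t) for t in r} <= r2
            and all(derivable(c2, s.get(x, x), s.get(y, y)) for x, y in c)
            and all(a2.get(v) == subst(s, t) for v, t in a.items())
            and sigma2 == subst(s, sigma))


first = (set(), {("s", "t")}, {}, ("->", "s", "t"))
second = (set(), {("a", "b"), ("c", "d")}, {}, ("->", "c", "d"))
forward = is_instance({"s": "c", "t": "d"}, first, second)
backward = is_instance({"a": "s", "b": "t", "c": "s", "d": "t"}, second, first)
```

The second substitution collapses both assertions of the second typing onto $s \subseteq t$, which is why the two typings are mutual instances without being renamings of each other.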

It is useful to point out here that the presence of an equivalence class of most general 
restricted typings versus a single most general restricted typing does not affect the decid- 
ability of the type inference problem. As we will prove in the next chapter, the algorithm, 
given a typable expression in the language, yields a most general typing for that expression. 
It does not matter which most general restricted typing is yielded, since it is possible to 
check whether one restricted typing is an instance of another. 


Lemma 3.7.1 For any restricted typing $R \mid C, A \supset M : \sigma$, the set of instances of
$R \mid C, A \supset M : \sigma$ is recursive.


We omit the proof here, stating only that the obvious algorithm takes time exponential
in the size of the restricted typing. The difficulty in this algorithm seems to lie in checking
the instance conditions on $R$ and $C$, and we have not yet determined
whether a polynomial time algorithm exists. We do know that if we assume that all
expressions in ML$^+$ are renamed (i.e., alpha-converted) so that all lambda-bound variables
and extension variables in the expression are distinct, and we associate the corresponding
expressions with the cut types in $R$, then checking the instance condition on this augmented
set $R$ can be done in polynomial time. It is unclear to us at the present time whether the
condition on $C$ can be checked in polynomial time; we have not yet examined
this issue in detail.

We note here that, as a consequence of Lemma 3.7.1, the decision problem for determining
whether a restricted typing for an expression $M$ is provable is reducible to the problem
of computing a most general typing for $M$.


3.8 Relating the R-Typing System to the Original System


It may seem peculiar to the reader that we first developed a typing system that captured 
the form of subtyping discussed in Chapter 1, proceeded to discuss the technical difficulties 
in the typing system, and then modified the typing system so that it has the technical 
properties that we desire. The obvious question arises as to which typing system, the
original typing system or the R-typing system, is the “real” typing system that defines
ML$^+$.

It turns out that the original typing system and the R-typing system are, in fact, inti- 
mately related. More precisely, 


Lemma 3.8.1 A typing $C, A \supset M : \sigma$ is provable in the original typing system iff there
exists a set $R$ such that $R \mid C, A \supset M : \sigma$ is provable in the R-typing system.


Proof. To prove one direction, suppose that $\vdash C, A \supset M : \sigma$, and let $R$ be the set of
“cut” types in the derivation of $C, A \supset M : \sigma$. Then there is a derivation of $R \mid C, A \supset M : \sigma$
that uses the sequence of typing rules in the R-typing system that correspond exactly to
the sequence of typing rules used in the derivation of $C, A \supset M : \sigma$.

To prove the other direction, suppose that $\vdash R \mid C, A \supset M : \sigma$ for some set $R$. Then
$\vdash C, A \supset M : \sigma$ by a derivation that uses the sequence of typing rules in the original typing
system that correspond exactly to the sequence of typing rules used in the derivation of
$\vdash R \mid C, A \supset M : \sigma$. $\Box$

Furthermore, we can state the following property of the original typing system that is 
very similar to Theorem 3.6.7. 


Lemma 3.8.2 Suppose $\vdash C, A \supset M : \sigma$. If $C', A' \supset M : \sigma'$ is an instance of $C, A \supset M : \sigma$ by
a substitution that is defined on the set of cut types in the derivation of $C, A \supset M : \sigma$, then
$\vdash C', A' \supset M : \sigma'$.


The proof is very similar to that of Theorem 3.6.7. 

Using this restricted notion of instance, we could formulate a definition of most general
typings for the original typing system that “implicitly” referred to the set of cut types in the
derivation, and proceed to show that the algorithm is “equivalent” to the original typing
system. However, the R-typing system provides a nice syntactic mechanism to explicitly
keep track of these cut types, and therefore we will prove in Chapter 4 that the algorithm
is “equivalent” to the R-typing system. Nevertheless, using the above lemmas as justification,
we consider the original typing system to be the typing system that defines ML$^+$.


Chapter 4


The Type Inference Algorithm 


4.1 Overview 


In this chapter, we present the type inference algorithm, GE, for ML$^+$. We develop
Theorems 4.5.3 and 4.6.4, the main theorems of this thesis, which show that, for every expression
$M$ of ML$^+$, GE derives a most general restricted typing statement for $M$ if $M$ is typable in
the R-typing system of Chapter 3 and “fails” otherwise. More specifically, Theorem 4.5.3
shows that GE is sound with respect to the R-typing system; that is, for any expression 
M, if the algorithm succeeds with a restricted typing statement, then every instance of the 
typing is provable. Conversely, Theorem 4.6.4 shows that GE is complete with respect to 
the R-typing system; that is, for any expression M, if there is a provable restricted typing 
for M, then the algorithm will succeed with a restricted typing statement of which the 
provable typing is an instance. 

This chapter is organized as follows. Sections 4.2 and 4.3 develop two algorithms, UNIFY 
and MATCH, respectively, that form the foundation of the type inference algorithm. The 
algorithm UNIFY for unification is used to combine most general restricted typings of 
subexpressions, while the algorithm MATCH for matching is used to guarantee that the 
set of subtype assertions in a most general restricted typing is atomic. The type inference 
algorithm, GE, is presented in Section 4.4, while Sections 4.5 and 4.6 prove that GE is 
sound and complete with respect to the R-typing system. Finally, Section 4.7 relates the 
type inference algorithm to the original typing system. 


4.2 Unification 


As in ML, we will use unification to combine (restricted) typing statements about
subexpressions in the process of inferring a typing for an expression in ML⁺. The standard notion
of unification is to produce a substitution that makes a pair of type expressions syntactically 
identical. However, because the first component of our record types is a function, we need 
to use a slightly modified definition of syntactic identity. To this end, we define syntactic 
likeness between type expressions as follows: 


• b₁ is syntactically like b₂ if b₁ and b₂ are the same ground constant

• q₁ is syntactically like q₂ if q₁ and q₂ are the same built-in constant

• x is syntactically like x

• σ→τ is syntactically like σ′→τ′ if σ is syntactically like σ′ and τ is syntactically like τ′

• (h, Z) is syntactically like (h′, Z) if dom(h) = dom(h′)
  and ∀l ∈ dom(h). h(l) is syntactically like h′(l)


We say that a substitution S unifies a set E of equations between type expressions, or
equivalently, that S is a unifying substitution for E, if, for all equations σ = τ ∈ E, S is
defined on σ and τ, and Sσ is syntactically like Sτ.
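
As an informal illustration of why syntactic likeness is used in place of syntactic identity, the following sketch compares type expressions in a small Python encoding. The tuple encoding and the function name are our own assumptions for illustration and are not part of the thesis; the clause for built-in constants is omitted. Because the first component of a record type is a function from labels to types, two records are alike when these functions agree pointwise, regardless of how they happen to be written down.

```python
# Hypothetical encoding of type expressions (an assumption, not from the thesis):
#   ("base", name)        ground type constant such as int or bool
#   ("var", name)         type variable
#   ("fun", dom, cod)     function type
#   ("rec", fields, row)  record type (h, Z): fields is a dict from labels to
#                         types, row is "NULL" or the name of a row variable

def syntactically_like(a, b):
    """Return True if the type expressions a and b are syntactically alike."""
    if a[0] != b[0]:
        return False
    if a[0] in ("base", "var"):
        return a[1] == b[1]
    if a[0] == "fun":
        return syntactically_like(a[1], b[1]) and syntactically_like(a[2], b[2])
    # Record types: same row part, same domain, and fieldwise-alike types.
    _, h1, z1 = a
    _, h2, z2 = b
    return (z1 == z2
            and h1.keys() == h2.keys()
            and all(syntactically_like(h1[l], h2[l]) for l in h1))


# Two presentations of the same record type, with fields listed in different orders.
r1 = ("rec", {"a": ("base", "int"), "b": ("base", "bool")}, "X")
r2 = ("rec", {"b": ("base", "bool"), "a": ("base", "int")}, "X")
print(syntactically_like(r1, r2))
```

Under this encoding, checking that a substitution S unifies a set E amounts to applying S to both sides of every equation and comparing the results with `syntactically_like`.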

In the process of computing most general typings for expressions in ML⁺, we would
like to compute “most general unifiers” in order to combine the most general typings of
subexpressions. Following [Rob65], we would like to define a substitution S that unifies a
set E of equations between type expressions as a most general unifier for E if S unifies E
and, for all substitutions T that unify E, there exists a substitution U such that S;U is
defined and T = S;U. However, due to extended record types, it turns out that we will
need to modify this definition somewhat.

We illustrate the difficulty with the standard definition through an example. Consider
the equation

{a: int; X} = {b: bool; X′}

Clearly, any substitution that unifies this equation must map X′ to a record type that
contains at least an a: int field and must map X to a record type that contains at least
a b: bool field. Furthermore, the “rest” of X and X′ must be identical. Thus, the most
general substitution S that unifies this equation is

S = [{b: bool; Y}/X, {a: int; Y}/X′]

for some “fresh” Y, where Y is a row variable that does not occur anywhere in the set of
equations that we are unifying.

It turns out that the difficulty with the standard definition lies precisely in the
introduction of such fresh row variables. Consider the substitution

T = [{b: bool; NULL}/X, {a: int; NULL}/X′, {a: real; NULL}/Y]

Clearly, T unifies the above equation. However, since any substitution U such that T = S;U
must map Y to {NULL}, the action of S;U on Y differs from that of T. Therefore, there
is no such substitution U, and thus, S cannot be the most general unifier for E.

In order to overcome this difficulty, we introduce a “restricted” form of equality over
substitutions. Let Q be a set of type variables and row variables. We say that S is equal
to T with respect to Q, written S =_Q T, if ∀t ∈ Q. St = Tt and ∀X ∈ Q. SX = TX. We
also are more precise in our notion of “fresh” variables, and say that a row variable X is
fresh with respect to Q if X ∉ Q. An analogous definition, which we will need in the next
section, holds for type variables.

Using these definitions, we restrict our notion of most general unifiers as follows. Let E
be any set of equations between type expressions, and let Q again be any set of type variables
and row variables. We say that a substitution S is a most general unifying substitution for
E with respect to Q if S unifies E and, for all substitutions T that unify E, there exists a
substitution U such that S;U is defined and T =_{Q ∪ Q_E} S;U, where Q_E is the set of type
and row variables occurring in E. The idea behind this restriction is that, while computing
S, we choose the “fresh” row variables to be fresh with respect to Q ∪ Q_E. If we ensure
that Q is a co-infinite set, then we will always be able to choose a “fresh” row variable, thus
getting the behavior we desire. (Recall that we assume an infinite set of type variables and
row variables in our system.)
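
The example substitutions S and T discussed earlier in this section can be checked mechanically. The sketch below is only an illustration under our own assumptions: type expressions are encoded as Python tuples, a substitution is a dict, row variables are distinguished by an upper-case first letter, X′ is written `X1`, and the helper names `apply`, `compose` and `agree_on` are hypothetical. It confirms that S;U and T agree on the variables of the equation but not on the fresh row variable Y, which is exactly what restricted equality is designed to tolerate.

```python
NULL = "NULL"

def apply(sub, ty):
    """Apply substitution sub (dict: variable name -> type expression) to ty."""
    tag = ty[0]
    if tag == "base":
        return ty
    if tag == "var":
        return sub.get(ty[1], ty)
    if tag == "fun":
        return ("fun", apply(sub, ty[1]), apply(sub, ty[2]))
    _, fields, row = ty
    fields = {l: apply(sub, t) for l, t in fields.items()}
    if row != NULL and row in sub:
        # A row variable is mapped to a record type whose fields are pasted in.
        _, extra, row = sub[row]
        assert not (fields.keys() & extra.keys()), "result undefined: duplicate label"
        fields.update(extra)
    return ("rec", fields, row)

def compose(s, u):
    """The substitution s;u: apply s first, then u."""
    out = {v: apply(u, img) for v, img in s.items()}
    for v, img in u.items():
        out.setdefault(v, img)
    return out

def agree_on(s, t, q):
    """Restricted equality S =_Q T: s and t act identically on every variable in q."""
    def probe(v):
        # Assumed convention: row-variable names start with an upper-case letter.
        return ("rec", {}, v) if v[0].isupper() else ("var", v)
    return all(apply(s, probe(v)) == apply(t, probe(v)) for v in q)

INT, BOOL, REAL = ("base", "int"), ("base", "bool"), ("base", "real")
lhs = ("rec", {"a": INT}, "X")      # {a: int; X}
rhs = ("rec", {"b": BOOL}, "X1")    # {b: bool; X'}

S = {"X": ("rec", {"b": BOOL}, "Y"), "X1": ("rec", {"a": INT}, "Y")}
T = {"X": ("rec", {"b": BOOL}, NULL), "X1": ("rec", {"a": INT}, NULL),
     "Y": ("rec", {"a": REAL}, NULL)}
U = {"Y": ("rec", {}, NULL)}        # forced by T = S;U on X and X'

print(apply(S, lhs) == apply(S, rhs), apply(T, lhs) == apply(T, rhs))
print(agree_on(compose(S, U), T, {"X", "X1"}))
print(agree_on(compose(S, U), T, {"X", "X1", "Y"}))
```

Here the set {X, X1} plays the role of Q ∪ Q_E; since Y is chosen outside that set, the disagreement on Y no longer prevents S from being most general.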

Using these ideas, our algorithm UNIFY takes a set of equations E between type
expressions and a co-infinite set Q of type variables and row variables and produces a most
general unifying substitution for E with respect to Q ∪ Q_E, where Q_E is the set of type
variables and row variables that occur in E. If no unifying substitution for E exists, then
UNIFY fails. UNIFY is a modification of Robinson’s unification [Rob65], extended to solve
equations between extended record types, and is quite similar to the unification algorithm
described in [Wan87]. However, it corrects a bug in Wand’s algorithm for unifying two
extended record types; we postpone discussion of this bug to the proof of Lemma 4.2.1.

Quite importantly, another subtle bug in Wand’s algorithm is avoided by the syntactic 
restriction in ML⁺ prohibiting duplicate field names within a record. Because the language
of [Wan87] does not make this syntactic restriction, the algorithm given in [Wan87] for 
equations between row expressions is incorrect for his system; however, it is correct for our 
system. His corrected algorithm, which appears in [Wand88], needs to compute a set of 
most general unifiers, while our algorithm computes a single most general unifier. 


Lemma 4.2.1 Let E be a set of equations of the form σ = τ and let Q be a co-infinite set
of type variables and row variables. Let Q_E be the set of type variables and row variables
that occur in E. There exists an algorithm UNIFY such that whenever there is a unifying
substitution for E, UNIFY(E, Q) produces a most general unifying substitution with respect
to Q ∪ Q_E. Otherwise, UNIFY(E, Q) fails.


The proof appears in the Appendix. 


4.3 Matching


In the process of computing a most general restricted typing for an expression M, we 
will use unification to combine the most general restricted typings of the subexpressions 
of M. However, the most general unifier may not respect the sets of subtyping assertions 
in these typing statements, thus violating our requirement that typing statements contain 
only atomic subtype assertions. Therefore, we need to be able to compute a “most general 
matching substitution” for this set of possibly non-matching subtype assertions, so that we 
can apply the operation • developed in Chapter 3 to get a “most general” set of atomic
subtype assertions. 

We would like to follow the standard notion of most general substitutions and say that a
substitution S is a most general matching substitution for C if S is a matching substitution
for C and, for all matching substitutions T for C, there exists a substitution U such that
S;U is defined and T = S;U. However, a problem arises due to fresh type and row
variables which is similar to the one of Section 4.2, and so we restrict our notion of “most
general” in a similar manner. Let C be a set of possibly non-matching subtype assertions
and let Q be a set of type variables and row variables. We say that a substitution S is a
most general matching substitution for C with respect to Q if S is a matching substitution
for C and, for all matching substitutions T for C, there exists a substitution U such that
S;U is defined and T =_{Q ∪ Q_C} S;U, where Q_C is the set of type and row variables occurring
in C. Again, the idea behind this restriction is that the fresh type and row variables are
chosen from the complement of Q ∪ Q_C.

An extremely useful property of matching is that the problem of finding a most general
matching substitution for a set C of possibly non-matching subtype assertions can be
reduced to the problem of finding a most general unifying substitution for a set E that
is closely related to the set C. The basic idea here is that we treat the set C as a set of
equations between types (rather than inequalities) and unify this set of equations. There
is a minor complication, however, due to base types, since two distinct base types match
but do not unify. In order to get around this difficulty, we replace all base types (and
NULL) in C by special type variables (and a special row variable), and check that the most
general unifier merely renames these special type variables and row variable. We then derive the
most general matching substitution from this most general unifier S by “factoring” S into
a “most general” substitution S₁ composed with a “simple” substitution S₂ such that all
type variables and row variables occurring in the range of S₁ are distinct and such that S₂
merely renames type variables and row variables. If we incorporate the set Q appropriately
into the above procedures, we can use the properties of most general unifiers to show that
the substitution S₁ is the most general matching substitution for C with respect to Q.
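
The first step of this reduction, replacing base types and NULL by special variables and reading the subtype assertions as equations, can be sketched as follows. This is a minimal illustration under our own assumptions: the tuple encoding of types, the function names, and the variable names `t_int`, `t_real`, ... and `X_NULL` are hypothetical, and the special variables are simply assumed to be fresh with respect to Q and the variables of C.

```python
NULL = "NULL"

def abstract(ty):
    """Replace each base type c by the special variable t_c and NULL by X_NULL."""
    tag = ty[0]
    if tag == "base":
        return ("var", "t_" + ty[1])
    if tag == "var":
        return ty
    if tag == "fun":
        return ("fun", abstract(ty[1]), abstract(ty[2]))
    _, fields, row = ty
    return ("rec",
            {l: abstract(t) for l, t in fields.items()},
            "X_NULL" if row == NULL else row)

def equations_for(assertions):
    """Turn a list of subtype assertions (sigma, tau) into the equation set E."""
    return [(abstract(s), abstract(t)) for s, t in assertions]

# The assertion  int -> {a: bool; NULL}  <=  real -> {a: t; X}
C = [(("fun", ("base", "int"), ("rec", {"a": ("base", "bool")}, NULL)),
      ("fun", ("base", "real"), ("rec", {"a": ("var", "t")}, "X")))]
E = equations_for(C)
print(E)
```

The two sides of the assertion match but do not unify, because int and real are distinct constants; after abstraction, the corresponding equation is solvable by a substitution that merely renames `t_int`, `t_real`, `t_bool` and `X_NULL`.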

Using these ideas, our algorithm MATCH takes a set C of possibly non-matching subtype
assertions and a co-infinite set Q of type variables and row variables and produces a most
general matching substitution for C with respect to Q ∪ Q_C, where Q_C is the set of type
variables and row variables occurring in C. If no matching substitution for C exists, then
MATCH fails. Our algorithm is quite similar to the algorithm of [Mit84], extended to 
support base types and record types. 

This section is organized as follows. We will first show in more detail how the problem 
of finding matching substitutions reduces to the problem of finding unifying substitutions. 
Being careful about the set Q, we will then precisely define what it means for all type and 
row variables in the range of a substitution to be distinct, and then show how to “factor” 
any substitution S into two substitutions S₁ and S₂ with the properties outlined above.
Finally, using these results, we will present the actual algorithm MATCH and give the 
proof of its correctness. 

In order to relate matching and unification, we first need to define precisely what it
means for a substitution to merely rename type variables and row variables. Formally, we
say that a substitution S is simple if, for all t ∈ dom(S), there exists a type variable t′ such
that St = t′, and for all X ∈ dom(S), there exists a row variable X′ such that SX = (φ, X′).
Similarly, a c-simple substitution corresponds to either renaming type variables and row
variables or replacing them by base types or NULL, respectively. Formally, we say that a
substitution S is c-simple if, for all t ∈ dom(S), there exists either a type variable t′ such
that St = t′ or a base type c such that St = c, and for all X ∈ dom(S), either there exists
a row variable X′ such that SX = (φ, X′) or SX = (φ, NULL). It is easy to show that if
T is a simple substitution or a c-simple substitution, then, for any substitution S, S;T is
defined.
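
The two definitions translate directly into predicates on substitutions. The sketch below is illustrative only and rests on our own assumptions: substitutions are Python dicts, type expressions are tuples, and a variable name beginning with an upper-case letter denotes a row variable. The record (φ, X′) is encoded with an empty field dictionary.

```python
NULL = "NULL"

def is_row(name):
    # Assumed naming convention: row variables start with an upper-case letter.
    return name[0].isupper()

def is_simple(sub):
    """Every type variable goes to a type variable, every row variable to (phi, X')."""
    for v, img in sub.items():
        if is_row(v):
            if not (img[0] == "rec" and not img[1] and img[2] != NULL):
                return False
        elif img[0] != "var":
            return False
    return True

def is_c_simple(sub):
    """As is_simple, but base types and (phi, NULL) are also allowed as images."""
    for v, img in sub.items():
        if is_row(v):
            if not (img[0] == "rec" and not img[1]):
                return False
        elif img[0] not in ("var", "base"):
            return False
    return True

renaming = {"t": ("var", "u"), "X": ("rec", {}, "Y")}
constants = {"s_int": ("base", "int"), "Y_NULL": ("rec", {}, NULL)}
print(is_simple(renaming), is_c_simple(renaming))
print(is_simple(constants), is_c_simple(constants))
```

Every simple substitution is also c-simple, while a substitution such as `constants`, which sends variables to int and to (φ, NULL), is c-simple without being simple.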

Just to keep our definitions explicit, we formally define the range of any substitution S
to be the set of type variables and row variables that occur in S[s] or S[X] for some s, X ∈
dom(S).

We now show precisely how the problem of finding a matching substitution for a set C 
(with respect to a set Q) reduces to the problem of finding a unifying substitution for a 
related set E (with respect to Q).


Lemma 4.3.1 Let C be a set of subtype assertions σ ⊆ τ, where σ and τ may not necessarily
match, and let Q be a finite set of type variables and row variables. Let Q_C be the set of
type variables and row variables that occur in C, and let t_int, t_real, t_bool, t_string, and X_NULL
be fresh with respect to Q ∪ Q_C. Let C′ be the set C with all occurrences of int replaced
with t_int, all occurrences of real replaced with t_real, all occurrences of bool replaced with
t_bool, all occurrences of string replaced with t_string and all occurrences of NULL replaced
with X_NULL. Let E = {σ′ = τ′ | σ′ ⊆ τ′ ∈ C′}.

T is a matching substitution for C iff there exist substitutions T₁, T₂, and T₃, such that
T₂ is c-simple, T₃ is simple,

T =_{Q ∪ Q_C} T₁;T₂, and T₁′;T₃ unifies E,

where T₁′ = T₁↾(Q ∪ Q_C),
and, for all i ∈ {int, real, bool, string},

1. (T₁′;T₃)t_i ≠ σ→τ, for any σ, τ
2. (T₁′;T₃)t_i ≠ (h, Z), for any h, Z
3. (T₁′;T₃)X_NULL ≠ (h, Z), for any h ≠ φ.


Proof. To prove one direction, suppose that T is a matching substitution for C. Let
Q_C′ = Q_C ∪ {t_i | i ∈ {int, real, bool, string}} ∪ {X_NULL}. Let s_int, s_real, s_bool, s_string,
and Y_NULL be fresh with respect to Q ∪ Q_C′.

Let T₁ be a substitution whose domain is (Q ∪ Q_C). Furthermore, T₁ maps all type
variables t in its domain to Tt with all occurrences of int replaced with s_int, all occurrences
of real replaced with s_real, all occurrences of bool replaced with s_bool, all occurrences of
string replaced with s_string and all occurrences of NULL replaced with Y_NULL. (The action
of T₁ on row variables in its domain is analogous.)

Let T₂ = [int/s_int, real/s_real, bool/s_bool, string/s_string, (φ, NULL)/Y_NULL]. Clearly,
T₁;T₂ is defined, and

T =_{Q ∪ Q_C} T₁;T₂.

It is easy to see that T₁′ = T₁↾(Q ∪ Q_C) = T₁. Since T respects C and T₂ is c-simple, it
follows that T₁ respects C′ as well.

Let T₃ be the substitution whose domain is Q ∪ Q_C ∪ range(T₁) and such that T₃
maps all type variables in its domain to some type variable t′ and maps all row variables
in its domain to (φ, X′), for some row variable X′. Clearly, T₃ is simple, and so T₁;T₃ is
defined. Furthermore, since the t_i and X_NULL are fresh with respect to Q ∪ Q_C, we have
that T₁t_i = t_i for all the t_i, and that T₁X_NULL = (φ, X_NULL). Thus, (T₁;T₃)t_i = t′ and
(T₁;T₃)X_NULL = (φ, X′), satisfying conditions (1), (2) and (3) above.

Since T₁C is matching, we have that for all σ ⊆ τ ∈ C, T₁σ and T₁τ differ only in the
names of type variables and/or ground types and in the names of row variables and/or
NULL. Thus, by the construction of T₁ and C′, we have that for all σ′ ⊆ τ′ ∈ C′, T₁σ′ and
T₁τ′ differ only in the names of type variables and in the names of row variables. Thus,
T₁;T₃ unifies E, proving one direction.

To prove the converse, we note that T₃ is simple implies that for all σ′ ⊆ τ′ ∈ C′, T₁′σ′
matches T₁′τ′. Since T₁′t_i = t_i for all the t_i, and T₁′X_NULL = (φ, X_NULL), it follows that T₁′
is a matching substitution for C. Furthermore, by the definition of T₁′ and Q_C, it follows
that T₁ is a matching substitution for C.

Since T₂ is c-simple, T₁;T₂ is also a matching substitution for C. Then, clearly, so is
(T₁;T₂)↾(Q ∪ Q_C), and thus so is T↾(Q ∪ Q_C). By the definition of Q_C, T is a matching
substitution for C, proving the lemma. ∎




In order to show how to “factor” substitutions, we state precisely what it means for all 
the type and row variables occurring in the range of the most general unifier to be distinct. 
Since we do not want these type and row variables to conflict with the set Q in our
notion of most general matching substitutions, we incorporate Q into our definition. 


Definition 4.3.2 A substitution S chooses variables freely on a set Q of type variables and
row variables if no type variable or row variable in Q occurs in range(S) and

• For each t ∈ Q, no type variable or row variable appears twice in St.

• For each X ∈ Q, no type variable or row variable appears twice in SX.

• For any distinct s, t ∈ Q, no type variable or row variable in Ss appears in St.

• For any distinct X, Y ∈ Q, no type variable or row variable in SX appears in SY.

• For any t, X ∈ Q, no type variable or row variable in St appears in SX.
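
Read together, the five conditions say that all variable occurrences in the images of the members of Q are pairwise distinct. A possible check, written as a sketch under our own assumptions (tuple-encoded types, dict substitutions, row variables named with an upper-case first letter, and hypothetical helper names), is:

```python
NULL = "NULL"

def variables(ty):
    """List every type- and row-variable occurrence in ty, from left to right."""
    tag = ty[0]
    if tag == "base":
        return []
    if tag == "var":
        return [ty[1]]
    if tag == "fun":
        return variables(ty[1]) + variables(ty[2])
    _, fields, row = ty
    occ = [v for l in sorted(fields) for v in variables(fields[l])]
    return occ + ([] if row == NULL else [row])

def image(sub, v):
    """The image of variable v under sub; variables outside dom(sub) are unchanged."""
    unchanged = ("rec", {}, v) if v[0].isupper() else ("var", v)
    return sub.get(v, unchanged)

def chooses_freely(sub, q):
    """Check Definition 4.3.2 for substitution sub on the collection of variables q."""
    rng = {v for img in sub.values() for v in variables(img)}
    if rng & set(q):
        return False            # a variable of Q occurs in range(S)
    occ = [v for x in q for v in variables(image(sub, x))]
    return len(occ) == len(set(occ))   # no variable occurs twice overall

q = ["t", "X"]
good = {"t": ("fun", ("var", "u1"), ("var", "u2")),
        "X": ("rec", {"a": ("var", "u3")}, "Y")}
shared = {"t": ("fun", ("var", "u1"), ("var", "u1"))}
print(chooses_freely(good, q), chooses_freely(shared, q))
```

Because type variables and row variables live in disjoint name spaces under the assumed convention, a single duplicate test over all occurrences covers the mixed case of the last condition as well.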


In the proof of correctness for the algorithm MATCH, we will need to use the prop- 
erty that choosing variables freely is preserved under composition of appropriately defined 
substitutions. 


Lemma 4.3.3 Suppose that a substitution S chooses variables freely on a set Q of type
variables and row variables, and that a substitution T chooses variables freely on the set of
all type variables and row variables occurring in all St or SX with t ∈ Q, X ∈ Q. If S;T
is defined, then S;T chooses variables freely on Q.


The proof is straightforward and we omit it. 

As alluded to previously, we will need to be able to “factor” a substitution S into
a substitution S₁ that chooses variables freely on a set Q of type and row variables
composed with a simple substitution S₂. To prove this property, we need to develop a
linear ordering on the “components” of a type. First of all, we consider a record type (h, Z)
to be written out as h(l₁), h(l₂), ..., h(lₙ), Z, where dom(h) = {l₁, ..., lₙ} and the lᵢ are in
lexicographic order. (The other types are written out in the expected manner.) Assuming
this order, we can unambiguously refer to the m-th type variable occurring in a type, or 
similarly, the m-th row variable occurring in a type. Assuming that each occurrence of a 
type variable, row variable, base type, and NULL determines a “position” in a type, we can 
also unambiguously refer to the k-th position in a type. 


Lemma 4.3.4 Let Q′ be a finite set of type variables and Q″ be a finite set of row variables,
and let Q = Q′ ∪ Q″. For any substitution S, there are substitutions S₁ and S₂ such that
S₁ and S₂ are computable, S₁ chooses variables freely on Q, S₂ is simple, and S =_Q S₁;S₂.


Proof. The proof is analogous to that of [Mit84]. To define S₁ and S₂, let t₁, t₂, ...
be an enumeration of Q′ and let X₁, X₂, ... be an enumeration of Q″, and let us partition
the complement of Q′ into disjoint sets U₁, U₂, ... and partition the complement of Q″ into
disjoint sets W₁, W₂, .... It will be convenient to choose some enumeration of each U_i, say
U_i = {t_{i,1}, t_{i,2}, ...} and some enumeration of each W_j, say W_j = {X_{j,1}, X_{j,2}, ...}.

For each t_i ∈ Q′, let S₁t_i be the type expression derived by replacing the m-th type
variable occurrence in St_i with the m-th type variable t_{2i,m} from U_{2i} and by replacing the
n-th row variable occurrence in St_i with the n-th row variable X_{2i,n} from W_{2i}. For each
X_j ∈ Q″, let S₁X_j be the type expression derived by replacing the m-th type variable
occurrence in SX_j with the m-th type variable t_{2j+1,m} from U_{2j+1} and by replacing the
n-th row variable occurrence in SX_j with the n-th row variable X_{2j+1,n} from W_{2j+1}. Let
S₂ map t_{2i,m} back to the m-th type variable in St_i, t_{2j+1,m} back to the m-th type variable
in SX_j, X_{2i,n} back to the n-th row variable in St_i, and X_{2j+1,n} back to the n-th row
variable in SX_j. Clearly, S₂ is simple, and thus S₁;S₂ is defined. It is easily verified that
S₁ chooses variables freely on Q and that S =_Q S₁;S₂. Moreover, since all type expressions are
of finite length, the complements of Q′ and Q″ do not need to be fully partitioned or fully
enumerated. Thus, S₁ and S₂ are computable by the procedure outlined above. ∎
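
The construction in this proof is effectively an algorithm, and a simplified version can be sketched in a few lines. The code below is an illustration under our own assumptions, not the thesis's procedure verbatim: instead of partitioning the complements of Q′ and Q″ in advance, it draws fresh names `u0`, `U1`, ... from a counter and skips any that belong to Q; the tuple encoding, the upper-case convention for row variables, and the helper names are likewise hypothetical.

```python
import itertools

NULL = "NULL"

def is_row(name):
    # Assumed naming convention: row variables start with an upper-case letter.
    return name[0].isupper()

def probe(name):
    """The expression on which the action of a substitution on `name` is read off."""
    return ("rec", {}, name) if is_row(name) else ("var", name)

def apply(sub, ty):
    """Apply substitution sub (dict: variable name -> type expression) to ty."""
    tag = ty[0]
    if tag == "base":
        return ty
    if tag == "var":
        return sub.get(ty[1], ty)
    if tag == "fun":
        return ("fun", apply(sub, ty[1]), apply(sub, ty[2]))
    _, fields, row = ty
    fields = {l: apply(sub, t) for l, t in fields.items()}
    if row != NULL and row in sub:
        _, extra, row = sub[row]
        fields.update(extra)
    return ("rec", fields, row)

def factor(sub, q):
    """Return (s1, s2) with s1 choosing variables freely on q, s2 simple,
    and sub agreeing with s1;s2 on every variable in q."""
    counter = itertools.count()
    s1, s2 = {}, {}

    def fresh(row):
        while True:
            name = ("U" if row else "u") + str(next(counter))
            if name not in q:
                return name

    def rename(ty):
        # Replace every variable occurrence by a fresh variable and record in
        # s2 how to map the fresh variable back to the original one.
        tag = ty[0]
        if tag == "base":
            return ty
        if tag == "var":
            new = fresh(False)
            s2[new] = ty
            return ("var", new)
        if tag == "fun":
            return ("fun", rename(ty[1]), rename(ty[2]))
        _, fields, row = ty
        fields = {l: rename(fields[l]) for l in sorted(fields)}
        if row != NULL:
            new = fresh(True)
            s2[new] = ("rec", {}, row)
            row = new
        return ("rec", fields, row)

    for v in sorted(q):
        s1[v] = rename(apply(sub, probe(v)))
    return s1, s2

S = {"t": ("fun", ("var", "s"), ("var", "s")),
     "X": ("rec", {"a": ("var", "s")}, "Y")}
Q = {"t", "X"}
S1, S2 = factor(S, Q)
print(S1)
print(S2)
```

In this example the three occurrences of s in the range of S are replaced by three different fresh variables in S1, and S2 sends each of them back to s, so that applying S2 after S1 reproduces S on Q.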

To show that the substitution that our MATCH algorithm computes is the most general
matching substitution, we will need to use the fact that any substitution S₁ with the
properties outlined above is “most general” in the following precise sense.


Lemma 4.3.5 Let Q be a finite set of type variables and row variables. Furthermore, let S
be any substitution, and let S₁ and S₂ be substitutions such that S₁ chooses variables freely
on Q, S₂ is simple, and S =_Q S₁;S₂. If S =_Q T₁;T₂ for some simple substitution T₂, then
there exists a substitution W such that S₁;W is defined and T₁ =_Q S₁;W.


Proof. Since S₂ and T₂ are simple, it follows that, for any t ∈ Q, the type expressions
St, S₁t, and T₁t must match. Similarly, for any X ∈ Q, the type expressions SX, S₁X, and
T₁X must match.

For any t ∈ Q, consider the k-th position of S₁t and the k-th position of T₁t. By the
definition of “simple”, it follows that either both positions contain type variables, both
positions contain row variables, or both positions contain the same ground type. In the first
case, since S₁ chooses variables freely on Q, there is a well-defined substitution W mapping
the type variable occurring in the k-th position of S₁t to the type variable occurring in
the k-th position of T₁t. Similarly, in the second case, since S₁ chooses variables freely on
Q, there is a well-defined substitution W mapping the row variable occurring in the k-th
position of S₁t to (φ, Y), where Y is the row variable occurring in the k-th position of T₁t.
(An analogous statement holds for any X ∈ Q.)

Clearly, W is simple, and thus S₁;W is defined. It is easy to verify that T₁ =_Q S₁;W, proving
the lemma. ∎

Having developed the necessary technical machinery, we now prove the main lemma
of this section; namely, that there exists an algorithm MATCH that, given a set C of
possibly non-matching subtype assertions and a set Q of type and row variables, computes
a most general matching substitution for C with respect to Q. We also prove that the domain of
MATCH(C, Q) is a subset of Q ∪ Q_C, where Q_C is the set of type variables and row variables
that occur in C, since, for technical reasons, we will need to use this fact in our proof of
completeness of GE in Section 4.6.4.


Lemma 4.3.6 Let C be a set of subtype assertions σ ⊆ τ, where σ and τ may not necessarily
match, and let Q be a finite set of type variables and row variables. Let Q_C be the
set of type variables and row variables that occur in C. There is an algorithm MATCH
such that if there is a matching substitution for C, MATCH(C, Q) produces a most general
matching substitution with respect to Q ∪ Q_C. Otherwise, MATCH(C, Q) fails.
Furthermore, if MATCH(C, Q) succeeds, then dom(MATCH(C, Q)) ⊆ (Q ∪ Q_C).


Proof. The algorithm is as follows:


MATCH(C, Q) =
    let Q_C be the set of type variables and row variables that occur in C
        t_int, t_real, t_bool, t_string, and X_NULL be “fresh” with respect to Q ∪ Q_C
        C′ be the set C with
            all occurrences of int replaced with t_int,
            all occurrences of real replaced with t_real,
            all occurrences of bool replaced with t_bool,
            all occurrences of string replaced with t_string, and
            all occurrences of NULL replaced with X_NULL
        E = {σ′ = τ′ | σ′ ⊆ τ′ ∈ C′}
        Q_C′ = Q_C ∪ {t_i | i ∈ {int, real, bool, string}} ∪ {X_NULL}
        S = UNIFY(E, Q ∪ Q_C′)
    if, for any t_i such that i ∈ {int, real, bool, string},
            St_i = σ→τ for any σ, τ or
            St_i = (h, Z) for any h, Z or
            SX_NULL = (h, Z) for any h ≠ φ
    then fail
    else let S₁ and S₂ be the substitutions such that
                 S =_{Q ∪ Q_C′} S₁;S₂
                 and S₁ chooses variables freely on Q ∪ Q_C′
                 and S₂ is simple
             S₁′ = S₁↾(Q ∪ Q_C)
         return S₁′


We first prove that if MATCH(C, Q) succeeds, then there is a matching substitution for
C. More specifically, we show that MATCH(C, Q) produces a matching substitution for C.
Suppose that MATCH(C, Q) succeeds. By Lemma 4.2.1, S is the most general unifier for
E with respect to Q ∪ Q_C′. Furthermore, S maps all the t_i to types that are not function
types or record types and maps X_NULL to a record with φ as its first component. Since E
does not contain NULL or any base types and S is a most general unifier for E (w.r.t.
Q ∪ Q_C′), we have that for all σ′ = τ′ ∈ E, both Sσ′ and Sτ′ do not contain NULL or any
base types.

Let S₁′ = S₁↾(Q ∪ Q_C). Since S₂ is simple, and by the properties of S mentioned above,
it follows that for all the t_i, there exists a type variable s_i such that S₁t_i = s_i. Also, by the
same reasoning, S₁X_NULL = (φ, X′), for some X′. Let

S₃ = [s_int/t_int, s_real/t_real, s_bool/t_bool, s_string/t_string, (φ, X′)/X_NULL]

Since S₁ chooses variables freely on Q ∪ Q_C′, we have that

S₁′;S₃ =_{Q ∪ Q_C′} S₁

Thus, since E contains only type and row variables from Q_C′, we have that S₁′;S₃;S₂ unifies
E. Noting that S₃;S₂ is simple, we can use Lemma 4.3.1 to prove that S₁ is a matching
substitution for C. Then, clearly, S₁′ must be a matching substitution for C.

To prove the other direction, suppose that there is a matching substitution T for C.
By Lemma 4.3.1, there exist substitutions T₁, T₂, and T₃ such that T =_{Q ∪ Q_C} T₁;T₂, T₂
is c-simple, T₃ is simple, and such that T₁′;T₃ unifies E, where T₁′ = T₁↾(Q ∪ Q_C). By
Lemma 4.2.1, we can conclude that S = UNIFY(E, Q ∪ Q_C′) succeeds, and that there exists
a substitution U such that S;U is defined and such that S;U =_{Q ∪ Q_C′} T₁′;T₃. Furthermore,
by Lemma 4.3.1, T₁′;T₃ maps all the t_i to types that are not function types or record types
and maps X_NULL to a record with φ as its first component. Thus, this property must also
hold true for S. By Lemma 4.3.4, it follows that S₁ and S₂ exist and are computable; thus,
MATCH(C, Q) succeeds. By the proof above, S₁′ is a matching substitution for C.

It remains to show that there is a substitution W such that S₁′;W is defined and
T =_{Q ∪ Q_C} S₁′;W. We first note that S =_{Q ∪ Q_C} S₁′;S₂. We let S′ = S↾(Q ∪ Q_C), and let
U′ = U↾(Q ∪ Q_C ∪ range(S′)). Since Q_C ⊆ Q_C′, we have that S₁′;S₂;U′ is defined and
that

S′;U′ =_{Q ∪ Q_C} S₁′;S₂;U′ =_{Q ∪ Q_C} T₁′;T₃

By Lemma 4.3.4, we can “factor” S₂;U′ into U₁;U₂ such that

S₂;U′ =_{Q ∪ Q_C ∪ range(S₁′)} U₁;U₂

and such that U₁ chooses variables freely on Q ∪ Q_C ∪ range(S₁′) and U₂ is simple. Since
S₁′;S₂ is defined and S₂;U′ =_{range(S₁′)} U₁;U₂, it follows that S₁′;U₁ is defined as well. By
Lemma 4.3.3, we have that S₁′;U₁ chooses variables freely on Q ∪ Q_C. Moreover,

S₁′;S₂;U′ =_{Q ∪ Q_C} S₁′;U₁;U₂

By Lemma 4.3.5, there exists a simple substitution V such that

S₁′;U₁;V =_{Q ∪ Q_C} T₁′

Thus, S₁′;U₁;V =_{Q ∪ Q_C} T₁. Putting it all together, we have that

S₁′;U₁;V;T₂ =_{Q ∪ Q_C} T₁;T₂ =_{Q ∪ Q_C} T

Letting W = U₁;V;T₂ proves this direction of the lemma.

To finish the proof of the lemma, we note that if MATCH(C, Q) succeeds, then, clearly,
dom(S₁′) ⊆ (Q ∪ Q_C). ∎


4.4 The Type Inference Algorithm 


This section presents the type inference algorithm, GE, for ML⁺. GE is written in an
applicative, pattern-matching style, and is defined by the mutually recursive clauses given 
in Table 4.1. 

The algorithms UNIFY and MATCH play a crucial role in GE, while a subsidiary role is 
played by algorithms for applying substitutions, composing substitutions, subtracting two 
type environments, computing the union of two type environments, and pasting two func- 
tions mapping labels to types. All of these subsidiary algorithms are assumed to fail when 
the desired result is not well-defined. For example, while we write Sσ for the application of
a substitution S to type expression σ, we assume in our algorithm GE that if the result is
undefined (not a well-formed expression), then the entire algorithm will terminate in error. 
This notational convention makes the pseudo-code in Table 4.1 more readable.

GE(b) = ∅ | {c ⊆ t}, ∅ ⊃ b : t, where c is the type of b


GE(q) =
    let Q = the type and row variables in τ_q, where τ_q is the built-in type for q
        T = MATCH({τ_q ⊆ t}, Q ∪ {t}), where t is fresh with respect to Q
    in ∅ | T • {τ_q ⊆ t}, ∅ ⊃ q : Tt


GE(x) = ∅ | {s ⊆ t}, {x : s} ⊃ x : t
GE((φ, EMPTY)) = ∅ | {(φ, NULL) ⊆ (φ, X)}, ∅ ⊃ (φ, EMPTY) : (φ, X)
GE((φ, u)) = ∅ | {(φ, X) ⊆ (φ, Y)}, {u : (φ, X)} ⊃ (φ, u) : (φ, Y)


GE((f, E)) = (where f is non-empty)
    let R_l | C_l, A_l ⊃ f(l) : σ_l = GE(f(l)), for all l ∈ dom(f)
            with the type and row variables in GE(f(l)) renamed to be distinct from those in GE(f(l′))
            for all l ≠ l′ where l, l′ ∈ dom(f)
        R_E | C_E, A_E ⊃ (φ, E) : (h_E, Z) = GE((φ, E))
            with the type and row variables in GE((φ, E)) renamed to be distinct
            from those in GE(f(l)), for all l ∈ dom(f)
        Q_l = the type and row variables in R_l | C_l, A_l ⊃ f(l) : σ_l, for all l ∈ dom(f)
        Q_E = the type and row variables in R_E | C_E, A_E ⊃ (φ, E) : (h_E, Z)
        T = UNIFY({α = β | w:α ∈ A_l and w:β ∈ A_l′ for l ≠ l′, l, l′ ∈ dom(f)
                          or w:α ∈ A_l and w:β ∈ A_E for l ∈ dom(f)}, Q_E ∪ ⋃_{l∈dom(f)} Q_l)
        S = T;MATCH(TC_E ∪ ⋃_{l∈dom(f)} TC_l, range(T) ∪ Q_E ∪ ⋃_{l∈dom(f)} Q_l)
    in SR_E ∪ ⋃_{l∈dom(f)} SR_l | S • C_E ∪ ⋃_{l∈dom(f)} S • C_l, SA_E ∪ ⋃_{l∈dom(f)} SA_l ⊃ (f, E) : S(h + h_E, Z)
        where dom(h) = dom(f) and ∀l ∈ dom(h). h(l) = σ_l


GE(MN) =
    let R₁ | C₁, A₁ ⊃ M : σ = GE(M)
        R₂ | C₂, A₂ ⊃ N : τ = GE(N)
            with type and row variables renamed to be distinct from those in GE(M)
        Q₁ = the type and row variables in R₁ | C₁, A₁ ⊃ M : σ
        Q₂ = the type and row variables in R₂ | C₂, A₂ ⊃ N : τ
        T = UNIFY({α = β | w:α ∈ A₁ and w:β ∈ A₂} ∪ {σ = τ→t}, Q₁ ∪ Q₂)
            where t is fresh with respect to Q₁ ∪ Q₂
        S = T;MATCH(TC₁ ∪ TC₂, range(T) ∪ Q₁ ∪ Q₂ ∪ {t})
    in SR₁ ∪ SR₂ ∪ {Sτ} | S • C₁ ∪ S • C₂, SA₁ ∪ SA₂ ⊃ MN : St


GE(fn P ⇒ M) =
    let R_p | C_p, A_p ⊃ P : σ = GE(P)
        R_e | C_e, A_e ⊃ M : τ = GE(M)
            with type and row variables renamed to be distinct from those in GE(P)
        Q_p = the type and row variables in R_p | C_p, A_p ⊃ P : σ
        Q_e = the type and row variables in R_e | C_e, A_e ⊃ M : τ
        T = UNIFY({α = β | w:α ∈ A_p and w:β ∈ A_e}, Q_p ∪ Q_e)
        S = T;MATCH(TC_p ∪ TC_e, range(T) ∪ Q_p ∪ Q_e)
    in SR_p ∪ SR_e | (S • C_p)? ∪ S • C_e, (SA_e − SA_p) ⊃ fn P ⇒ M : S(σ→τ)


4.5 Soundness


In this section, we prove Theorem 4.5.3, the first of the two main theorems of the thesis, 
which shows that GE is sound with respect to the R-typing system and generates the most 
general restricted typing. More precisely, it shows that for any expression M, if GE(M) 
yields a restricted typing statement, then every instance of that restricted typing is derivable 
in the R-typing system. 

We will prove Theorem 4.5.3 in the following manner. We will first prove Lemma 4.5.2, 
which shows that, for any expression M, if GE(M) succeeds and yields a restricted typing, 
then that restricted typing is provable in the R-typing system. Then, by Theorem 3.6.7, it 
follows easily that every instance of that restricted typing is provable. 

In order to prove Lemma 4.5.2, we will need to use the following fact about the action
of GE on patterns: namely, that it infers a type environment that contains mappings for 
exactly the free variables of the pattern. 


Lemma 4.5.1 If GE(P) = R | C, A ⊃ P : σ, then vars(A) = vars(P).


We omit the proof, as it follows by a simple induction on the structure of terms. 

Using some of the lemmas developed in Chapter 3 as well as the above lemma, we 
now prove the desired lemma about the algorithm. In our proof, we extend our notion of 
unification to type environments and say informally that “S unifies A₁ and A₂” if, for all
w such that w:α ∈ A₁ and w:β ∈ A₂ for some α and β, Sα is syntactically like Sβ.


Lemma 4.5.2 If GE(M) succeeds and yields R | C, A ⊃ M : σ, then R | C, A ⊃ M : σ is a
provable restricted typing statement.


Proof. The proof is analogous to [Mit84]. We proceed by induction on the structure 
of terms. 

If M is an integer b, then GE(M) = ∅ | {int ⊆ t}, ∅ ⊃ b : t. By axiom (int), we have that
⊢ ∅ | ∅, ∅ ⊃ b : int. Lemma 3.6.2 yields ⊢ ∅ | {int ⊆ t}, ∅ ⊃ b : int, and a use of (coerce) then
yields ⊢ ∅ | {int ⊆ t}, ∅ ⊃ b : t, as desired. The proof is analogous if b is a real, a boolean, or
a string.

If M is a variable x, then GE(M) = ∅ | {s ⊆ t}, {x : s} ⊃ x : t. By axiom (var), we
have that ⊢ ∅ | ∅, {x : s} ⊃ x : s. Lemma 3.6.2 yields ⊢ ∅ | {s ⊆ t}, {x : s} ⊃ x : s, and a use of
(coerce) then yields ⊢ ∅ | {s ⊆ t}, {x : s} ⊃ x : t, as desired.

If M is of the form (φ, EMPTY), then GE(M) = ∅ | {(φ, NULL) ⊆ (φ, X)}, ∅ ⊃
(φ, EMPTY) : (φ, X), for some X. We have that ⊢ ∅ | ∅, ∅ ⊃ (φ, EMPTY) : (φ, NULL) by
axiom (rec1). We have that ⊢ ∅ | {(φ, NULL) ⊆ (φ, X)}, ∅ ⊃ (φ, EMPTY) : (φ, NULL) by
Lemma 3.6.2, and we get ⊢ ∅ | {(φ, NULL) ⊆ (φ, X)}, ∅ ⊃ (φ, EMPTY) : (φ, X) by a use of
(coerce), as desired.

If M is of the form (φ, u), then GE(M) = ∅ | {(φ, X) ⊆ (φ, Y)}, {u : (φ, X)} ⊃ (φ, u) : (φ, Y).
By axiom (rec2), we have that ⊢ ∅ | ∅, {u : (φ, X)} ⊃ (φ, u) : (φ, X). Lemma 3.6.2 yields
⊢ ∅ | {(φ, X) ⊆ (φ, Y)}, {u : (φ, X)} ⊃ (φ, u) : (φ, X), and a use of (coerce) then yields
⊢ ∅ | {(φ, X) ⊆ (φ, Y)}, {u : (φ, X)} ⊃ (φ, u) : (φ, Y), as desired.

If M is a built-in constant q, then GE(M) = 0|Te{r, C t},0 D 6: Tt, for some t. By the 
properties of MATCH, T is a matching substitution for {r, C r}. Thus, by axiom (const), 
we have that F 0|0,0 5 q:T7,. Lemma 3.6.2 yields | 0|Te{r, C t},0D q:TT,. By Lemma 
3.5.5, Te {ty Ct} Tr, € Tt. Thus, a use of (coerce) yields + | Te {7, C t},0 D q:Tt, as 
desired. 
If $M$ is of the form $(f, E)$, where $f$ is non-empty, then we can assume inductively that $\vdash R_E \mid C_E, A_E \supset (\emptyset, E) : (h_E, Z)$ and that

$$\forall l \in dom(f).\ \vdash R_l \mid C_l, A_l \supset f(l) : h(l),$$

where $dom(h) = dom(f)$, and $\forall l \in dom(h).\ h(l) = \sigma_l$. Since, by assumption, GE succeeds, we can assume that $h + h_E$ is defined, and that $S$ respects $C_E$ and all the $C_l$, and that $S$ is defined on $R_E$ and all the $R_l$, on $A_E$ and all the $A_l$, and on $(h + h_E, Z)$. Therefore, $S$ is also defined on $(h_E, Z)$.

Thus, by Lemma 3.6.7, we have that $\vdash SR_E \mid S \bullet C_E, SA_E \supset (\emptyset, E) : S(h_E, Z)$ and that

$$\forall l \in dom(f).\ \vdash SR_l \mid S \bullet C_l, SA_l \supset f(l) : S(h(l)).$$

Since, by assumption, $GE(M)$ succeeds, we can assume that $SA_E \cup \bigcup_{l \in dom(f)} SA_l$ is a well-formed set. Thus, by Lemma 3.6.3, Lemma 3.6.2 and Lemma 3.6.1, we can infer that

$$\vdash SR_E \cup \bigcup_{l \in dom(f)} SR_l \mid S \bullet C_E \cup \bigcup_{l \in dom(f)} S \bullet C_l,\ SA_E \cup \bigcup_{l \in dom(f)} SA_l \supset (\emptyset, E) : S(h_E, Z)$$

and that

$$\forall l \in dom(f).\ \vdash SR_E \cup \bigcup_{l \in dom(f)} SR_l \mid S \bullet C_E \cup \bigcup_{l \in dom(f)} S \bullet C_l,\ SA_E \cup \bigcup_{l \in dom(f)} SA_l \supset f(l) : S(h(l)).$$

Since, by assumption, $S(h + h_E, Z)$ is defined, a use of (rec3) yields

$$\vdash SR_E \cup \bigcup_{l \in dom(f)} SR_l \mid S \bullet C_E \cup \bigcup_{l \in dom(f)} S \bullet C_l,\ SA_E \cup \bigcup_{l \in dom(f)} SA_l \supset (f, E) : S(h + h_E, Z),$$

as desired.

If $M$ is of the form $M'N'$, then we can assume inductively that $\vdash R_1 \mid C_1, A_1 \supset M' : \sigma$, and that $\vdash R_2 \mid C_2, A_2 \supset N' : \tau$. Since, by assumption, $GE(M)$ succeeds, we can assume that $S$ respects $C_1$ and $C_2$ and that $S$ is defined on $R_1$, $R_2$, $A_1$, $A_2$, $\tau$, and $t$. We first need to show that $S$ is defined on $\sigma$.

Assume for the sake of contradiction that $S$ is not defined on $\sigma$. Since, by assumption, $GE(M)$ succeeds, $T$ unifies $\sigma = \tau \to t$. Thus, it must be that $MATCH(TC_1 \cup TC_2)$ is not defined on $T\sigma$, which implies that $MATCH(TC_1 \cup TC_2)$ is not defined on $T(\tau \to t)$. But then, $S$ is not defined on $\tau \to t$, which leads to a contradiction. Thus, we can conclude that $S$ is defined on $\sigma$ and unifies $\sigma = \tau \to t$.

By Lemma 3.6.7, we have that $\vdash SR_1 \mid S \bullet C_1, SA_1 \supset M' : S\sigma$, and that $\vdash SR_2 \mid S \bullet C_2, SA_2 \supset N' : S\tau$. Since, by assumption, $GE(M)$ succeeds, we can assume that $SA_1 \cup SA_2$ is a well-defined set. Therefore, by Lemma 3.6.3, Lemma 3.6.2, and Lemma 3.6.1, and since $S\sigma = S\tau \to St$, we have that

$$\vdash SR_1 \cup SR_2 \cup \{S\tau\} \mid S \bullet C_1 \cup S \bullet C_2,\ SA_1 \cup SA_2 \supset M' : S\tau \to St$$

and that

$$\vdash SR_1 \cup SR_2 \cup \{S\tau\} \mid S \bullet C_1 \cup S \bullet C_2,\ SA_1 \cup SA_2 \supset N' : S\tau.$$

A use of rule (app) then yields

$$\vdash SR_1 \cup SR_2 \cup \{S\tau\} \mid S \bullet C_1 \cup S \bullet C_2,\ SA_1 \cup SA_2 \supset M'N' : St,$$

as desired.
If $M$ is of the form fn $P \Rightarrow M'$, then we can assume inductively that $\vdash R_p \mid C_p, A_p \supset P : \sigma$, and that $\vdash R_e \mid C_e, A_e \supset M' : \tau$. Since, by assumption, $GE(M)$ succeeds, we can assume that $S$ respects $C_p$ and $C_e$ and is defined on $R_p$, $R_e$, $A_p$, $A_e$, $\sigma$ and $\tau$. Thus, by Lemma 3.6.7, we have that $\vdash SR_p \mid S \bullet C_p, SA_p \supset P : S\sigma$, and that $\vdash SR_e \mid S \bullet C_e, SA_e \supset M' : S\tau$. Since, by assumption, $S$ unifies $A_p$ and $A_e$, we have that

$$(SA_e)[SA_p] = SA_p \cup SA_e$$

and is a well-defined set. Thus, by Lemma 3.6.3, Lemma 3.6.2, and Lemma 3.6.1, we can conclude that

$$\vdash SR_p \cup SR_e \mid S \bullet C_p \cup (S \bullet C_e)^{op},\ SA_p \supset P : S\sigma$$

and that

$$\vdash SR_p \cup SR_e \mid (S \bullet C_p)^{op} \cup S \bullet C_e,\ (SA_e)[SA_p] \supset M' : S\tau.$$

By Lemma 4.5.1, we can infer that $vars(A_p) = vars(P)$. Since $vars(SA_p) = vars(A_p)$, a use of rule (abs) yields

$$\vdash SR_p \cup SR_e \mid (S \bullet C_p)^{op} \cup S \bullet C_e,\ SA_e \supset \text{fn } P \Rightarrow M' : S(\sigma \to \tau).$$

Since $vars(\text{fn } P \Rightarrow M') = vars(M') - vars(P)$, we have by Lemma 3.6.3 that $vars(SA_e) \supseteq vars(M') - vars(P)$. Thus, since $vars(SA_e - SA_p) = vars(SA_e) - vars(P) \supseteq vars(M') - vars(P)$, we have by Lemma 3.6.3 that

$$\vdash SR_p \cup SR_e \mid (S \bullet C_p)^{op} \cup S \bullet C_e,\ SA_e - SA_p \supset \text{fn } P \Rightarrow M' : S(\sigma \to \tau),$$

proving the lemma. $\blacksquare$

We now formally state the main theorem concerning the soundness of GE with respect 
to the R-typing system. 


Lemma 4.5.3 If $GE(M)$ succeeds and yields $R \mid C, A \supset M : \sigma$, then every instance of $R \mid C, A \supset M : \sigma$ is provable.


The proof follows easily by Lemma 4.5.2 and Lemma 3.6.7. 


4.6 Completeness 


This section proves Theorem 4.6.4, the second main theorem of this thesis, which shows 
that GE is complete with respect to the R-typing system and generates the most general 
restricted typing. More precisely, it shows that for any expression M, if a restricted typing 
for M is derivable in the R-typing system, then GE yields a restricted typing statement of 
which the provable restricted typing for M is an instance. 

In our proof of Theorem 4.6.4, we will need to use an induction on the structure of terms 
in ML; that is, for any typable expression M, we will show inductively that the antecedents 
in the derivation of a provable typing for M are instances of the typing yielded by GE on 
the corresponding subexpressions of M. However, due to the rule (coerce) there may be 
many possible derivations for a provable typing, and thus, our inductive reasoning becomes 
more complicated. In order to simplify this reasoning, we show that, as in [Mit84], every 
typing derivation may be put in a certain “normal form”, in which the rule (coerce) is used 
only after the axioms in the typing system. Reasoning about the normal form derivation of 
provable typings greatly simplifies our proof of Theorem 4.6.4. 
Lemma 4.6.1 Suppose $R \mid C, A \supset M : \sigma$ is provable. Then there is a derivation with precisely the same cut formulas in which rule (coerce) is used only immediately after the typing axioms (int), (real), (bool), (string), (var), (rec1), (rec2), and (const).


Proof. The proof is similar to that of [Mit84]. We think of the proof of a restricted typing statement as a tree, where each leaf is an instance of the axiom (int), (real), (bool), (string), (var), (rec1), (rec2), or (const). We think of each node as labeled by both the restricted typing statement proved at that node and the final rule used in that proof. Given a proof of a restricted typing statement, we define the degree of the proof to be the number of pairs of internal tree nodes $(\alpha, \beta)$ such that there is a path from a leaf through $\alpha$ to $\beta$ and node $\beta$ is labeled with (coerce). Intuitively, the degree of the proof gives a measure of how far the occurrences of (coerce) are from the leaves. We note that a proof has degree zero iff the rule (coerce) is used only immediately after the axioms, and we show by induction on the degree of a proof that every provable statement has a proof of degree zero.
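
The degree measure can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the `Node` class, its `rule` strings, and the decision to model only the typing antecedents of each rule (the subtyping antecedent of (coerce) is omitted) are ours, not part of the formal development. We also read "a path from a leaf through $\alpha$ to $\beta$" as requiring $\alpha$ to lie strictly below $\beta$.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node of a derivation tree (hypothetical encoding)."""
    rule: str                                   # e.g. "int", "var", "coerce", "app"
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def internal_below(node: Node) -> int:
    """Number of internal (non-leaf) nodes strictly below `node`."""
    return sum(0 if c.is_leaf() else 1 + internal_below(c)
               for c in node.children)


def degree(node: Node) -> int:
    """Count pairs (alpha, beta) with beta a (coerce) node and alpha an
    internal node strictly below beta."""
    if node.is_leaf():
        return 0
    here = internal_below(node) if node.rule == "coerce" else 0
    return here + sum(degree(c) for c in node.children)


# (coerce) used directly after an axiom: degree zero.
normal = Node("coerce", [Node("int")])
# Two stacked uses of (coerce): one offending pair.
stacked = Node("coerce", [Node("coerce", [Node("var")])])
# (app) followed by (coerce), with both antecedents already coerced.
late = Node("coerce", [Node("app", [Node("coerce", [Node("var")]),
                                    Node("coerce", [Node("var")])])])
print(degree(normal), degree(stacked), degree(late))
```

Under this encoding a tree has degree zero exactly when every (coerce) node has only leaves as children, which matches the normal form described above.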

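If the final two rules used in the proof are both (coerce), then $\vdash R \mid C, A \supset M : \sigma$ must have followed by the antecedents $\vdash R \mid C, A \supset M : \gamma$ and $C \vdash \gamma \subseteq \sigma$ for some $\gamma$. Also, $\vdash R \mid C, A \supset M : \gamma$ must have followed by the antecedents $\vdash R \mid C, A \supset M : \tau$ and $C \vdash \tau \subseteq \gamma$ for some $\tau$. By (trans), we can infer that

$$C \vdash \tau \subseteq \sigma.$$

Therefore, we can derive $\vdash R \mid C, A \supset M : \sigma$ directly from $\vdash R \mid C, A \supset M : \tau$ by (coerce), thus reducing the degree of the proof.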

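If the final two rules used in the proof are (app) followed by (coerce), then $M$ is of the form $M = M'N'$ and $\vdash R \mid C, A \supset M : \sigma$ must have followed by the antecedents $\vdash R \mid C, A \supset M'N' : \gamma$ and $C \vdash \gamma \subseteq \sigma$ for some $\gamma$. Also, for some $\tau$, where $R = R' \cup \{\tau\}$, $\vdash R \mid C, A \supset M'N' : \gamma$ must have followed by the antecedents $\vdash R' \mid C, A \supset M' : \tau \to \gamma$ and $\vdash R' \mid C, A \supset N' : \tau$. By (arrow), we can infer that

$$C \vdash \tau \to \gamma \subseteq \tau \to \sigma.$$

Thus, we can derive $\vdash R' \mid C, A \supset M' : \tau \to \sigma$ directly from $\vdash R' \mid C, A \supset M' : \tau \to \gamma$ by (coerce), and then proceed to derive $\vdash R \mid C, A \supset M'N' : \sigma$ by (app), thus reducing the degree of the proof.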

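If the final two rules used in a proof are (abs) followed by (coerce), then $M$ is of the form $M = \text{fn } P \Rightarrow M'$ and $\sigma$ is of the form $\sigma = \sigma_1 \to \sigma_2$. Thus, $\vdash R \mid C, A \supset \text{fn } P \Rightarrow M' : \sigma_1 \to \sigma_2$ must have followed by the antecedents $\vdash R \mid C, A \supset \text{fn } P \Rightarrow M' : \gamma_1 \to \gamma_2$ and $C \vdash \gamma_1 \to \gamma_2 \subseteq \sigma_1 \to \sigma_2$ for some $\gamma_1$ and $\gamma_2$. Also, $\vdash R \mid C, A \supset \text{fn } P \Rightarrow M' : \gamma_1 \to \gamma_2$ must have followed by the antecedents $\vdash R \mid C^{op}, A' \supset P : \gamma_1$ and $\vdash R \mid C, A[A'] \supset M' : \gamma_2$ for some $A'$ such that $vars(A') = vars(P)$. By Lemma 3.5.2,

$$C \vdash \sigma_1 \subseteq \gamma_1 \quad \text{and} \quad C \vdash \gamma_2 \subseteq \sigma_2;$$

thus, $C^{op} \vdash \gamma_1 \subseteq \sigma_1$. Therefore, we can derive $\vdash R \mid C^{op}, A' \supset P : \sigma_1$ directly from $\vdash R \mid C^{op}, A' \supset P : \gamma_1$ by (coerce), and we can derive $\vdash R \mid C, A[A'] \supset M' : \sigma_2$ directly from $\vdash R \mid C, A[A'] \supset M' : \gamma_2$ by (coerce). We can then proceed to derive $\vdash R \mid C, A \supset \text{fn } P \Rightarrow M' : \sigma_1 \to \sigma_2$ by (abs), thus reducing the degree of the proof by 1.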

If the final two rules used in a proof are (rec3) followed by (coerce), then $M$ is of the form $M = (f, E)$, and $\sigma$ is of the form $\sigma = (h, Z)$, where $dom(f) \neq \emptyset$ and $dom(h) \neq \emptyset$. We can thus assume that, for some $(h', Z')$ where $dom(h') = dom(h)$, $\vdash R \mid C, A \supset M : \sigma$ must have followed by the antecedents $\vdash R \mid C, A \supset M : (h', Z')$ and $C \vdash (h', Z') \subseteq (h, Z)$. Also, we can assume that $\vdash R \mid C, A \supset M : (h', Z')$ must have followed by the antecedents $\forall l \in dom(f).\ \vdash R \mid C, A \supset f(l) : h_1'(l)$ and $\vdash R \mid C, A \supset (\emptyset, E) : (h_2', Z')$ for some $h_1'$ and $h_2'$, where $h' = h_1' + h_2'$. By Lemma 3.5.3,

$$C \vdash (\emptyset, Z') \subseteq (\emptyset, Z),$$

and

$$\forall l \in dom(h).\ C \vdash h'(l) \subseteq h(l).$$

Since $dom(h) = dom(h')$ and $dom(h_1') \cap dom(h_2') = \emptyset$, we can divide $h$ into $h = h_1 + h_2$, where $dom(h_1) = dom(h_1')$ and $dom(h_2) = dom(h_2')$. Since $h_2 \subseteq h$, we can infer by (record) that

$$C \vdash (h_2', Z') \subseteq (h_2, Z).$$

We can then derive $\vdash R \mid C, A \supset (\emptyset, E) : (h_2, Z)$ directly from $\vdash R \mid C, A \supset (\emptyset, E) : (h_2', Z')$ by (coerce). Moreover, for all $l \in dom(f)$, we can derive $\vdash R \mid C, A \supset f(l) : h_1(l)$ directly from $\vdash R \mid C, A \supset f(l) : h_1'(l)$ by (coerce), since $h_1 \subseteq h$. We can then derive $\vdash R \mid C, A \supset M : (h, Z)$ by (rec3), reducing the degree of the proof by 1, and proving the lemma. $\blacksquare$

In our proof of Theorem 4.6.4, we will need the following property about the action of 
GE on record expressions which contain no explicit labels: namely, that such records are 
always inferred to be of a record type whose first component is the empty function. 


Lemma 4.6.2 If $GE((\emptyset, E)) = R \mid C, A \supset (\emptyset, E) : (h, Z)$, then $dom(h) = \emptyset$.


The proof proceeds by simple case analysis. 
We will also need the following property of substitutions, which shows that, in a precise technical sense, substitution composition merges naturally with the operation $\bullet$.


Lemma 4.6.3 Suppose that $Q$ is a set of type variables and row variables and that $C$ is a set of possibly non-matching subtype assertions that contains only type variables and row variables from $Q$. Furthermore, suppose that $S$, $T$ and $W$ are substitutions such that $T;S$ is defined and $W =_Q T;S$. If $W$ and $T$ are matching substitutions for $C$, then

1. $S$ respects $(T \bullet C)$
2. $S \bullet (T \bullet C) \vdash W \bullet C$
3. $W \bullet C \vdash S \bullet (T \bullet C)$

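Proof. To prove (1), we note that, clearly, $S$ is a matching substitution for $TC$. It is easy to show by induction on the structure of terms that $S$ must respect $T \bullet C$.

To prove (2), we note that, by Lemma 3.5.5, $T \bullet C \vdash TC$. By Lemma 3.6.5, we have that

$$\forall \sigma' \subseteq \tau' \in TC.\ S \bullet (T \bullet C) \vdash ATOMIC(S\sigma' \subseteq S\tau').$$

Thus, it follows that $\forall \sigma \subseteq \tau \in C.\ S \bullet (T \bullet C) \vdash ATOMIC(S(T\sigma) \subseteq S(T\tau))$. Therefore, $S \bullet (T \bullet C) \vdash (T;S) \bullet C$. Since $C$ contains only type variables and row variables from $Q$, we have that $W \bullet C = (T;S) \bullet C$, as desired.

To prove (3), we note that, clearly, $(T;S)$ respects $C$. By Lemma 3.5.5, it follows that

$$\forall \sigma \subseteq \tau \in C.\ (T;S) \bullet \{\sigma \subseteq \tau\} \vdash (T;S)\{\sigma \subseteq \tau\}.$$

By induction on the structure of terms and by Lemma 3.5.2 and Lemma 3.5.3, it is easy to show that

$$\forall \sigma \subseteq \tau \in C.\ (T;S) \bullet \{\sigma \subseteq \tau\} \vdash \bigcup_{\sigma' \subseteq \tau' \in ATOMIC(T\sigma \subseteq T\tau)} \{S\sigma' \subseteq S\tau'\}.$$

By Lemma 3.5.5,

$$\forall \sigma \subseteq \tau \in C.\ (T;S) \bullet \{\sigma \subseteq \tau\} \vdash \bigcup_{\sigma' \subseteq \tau' \in ATOMIC(T\sigma \subseteq T\tau)} ATOMIC(S\sigma' \subseteq S\tau').$$

Thus, we have that $(T;S) \bullet C \vdash S \bullet (T \bullet C)$. Since $C$ contains only type variables and row variables from $Q$, $W \bullet C = (T;S) \bullet C$, proving the lemma. $\blacksquare$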
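
The two notions used in the hypothesis of Lemma 4.6.3, the composition $T;S$ and agreement $W =_Q T;S$ on a set $Q$ of variables, can be illustrated with a minimal sketch. The encoding below is an assumption made purely for illustration: type variables are Python strings, arrow types are tuples `("->", dom, cod)`, ground types are tuples `("base", name)`, and substitutions are total dictionaries. Row variables and the partiality of substitutions ("is defined on"), which matter in the formal development, are deliberately left out.

```python
def apply(S, ty):
    """Apply substitution S (a dict from variable names to types) to ty."""
    if isinstance(ty, str):                 # a type variable
        return S.get(ty, ty)
    if ty[0] == "->":                       # an arrow type
        return ("->", apply(S, ty[1]), apply(S, ty[2]))
    return ty                               # a ground type


def compose(T, S):
    """The composition T;S: first apply T, then apply S."""
    out = {v: apply(S, t) for v, t in T.items()}
    for v, t in S.items():
        out.setdefault(v, t)                # variables untouched by T
    return out


def agree_on(W, V, Q):
    """W =_Q V: the two substitutions agree on every variable in Q."""
    return all(apply(W, q) == apply(V, q) for q in Q)


T = {"t": ("->", "s", "u")}
S = {"s": ("base", "int")}
W = compose(T, S)
ty = ("->", "t", "s")
# Applying T;S in one step equals applying T and then S.
print(apply(W, ty) == apply(S, apply(T, ty)))
```

Agreement on $Q$ is weaker than equality: two substitutions may differ on variables outside $Q$ and still satisfy `agree_on`.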

Using the above lemmas as well as some of the lemmas developed in Chapter 3, we now 
prove our main theorem about the completeness of GE with respect to the R-typing system. 


Theorem 4.6.4 If $\vdash R \mid C, A \supset M : \gamma$, then $GE(M)$ succeeds and produces a restricted typing statement with $R \mid C, A \supset M : \gamma$ as an instance.


Proof. The proof is similar to [Mit84], except that we show that the substitution 
which provides the instance is defined on GE(M). We proceed by induction on the structure 
of terms. 

Suppose that $M$ is an integer $b$, and that $\vdash R \mid C, A \supset b : \gamma$. By Lemma 4.6.1, we can assume without loss of generality that the proof consists of a use of (int) followed by a use of (coerce). Thus, $C \vdash int \subseteq \gamma$. By Lemma 3.5.1, we can conclude that $int$ matches $\gamma$.

It is easy to see that $GE(b)$ always succeeds. It remains to show that $R \mid C, A \supset b : \gamma$ is an instance of $GE(b)$ by some substitution $S$. Let $S = [\gamma / t]$. Clearly, $S$ respects $\{int \subseteq t\}$. Thus, by Lemma 3.5.5, we can conclude that $C \vdash S \bullet \{int \subseteq t\}$, giving the proof of this case. The proof is analogous if $b$ is a real, a boolean, or a string.

Suppose that $M$ is a variable $x$, and that $\vdash R \mid C, A \supset x : \gamma$. By Lemma 4.6.1, we can assume without loss of generality that the proof consists of a use of (var) followed by a use of (coerce). By Lemma 3.6.3, $x$ must occur in $A$. Let $\tau$ be the type expression with $x : \tau \in A$, and note that $C \vdash \tau \subseteq \gamma$. By Lemma 3.5.1, $\gamma$ and $\tau$ must match.

It is easy to see that $GE(x)$ always succeeds. It remains to show that $R \mid C, A \supset x : \gamma$ is an instance of $GE(x)$ by some substitution $S$. Let $S = [\gamma / t, \tau / s]$. Clearly $S$ respects $\{s \subseteq t\}$; thus, by Lemma 3.5.5, we can conclude that $C \vdash S \bullet \{s \subseteq t\}$. Moreover, since $x : \tau \in A$, it follows that $A \supseteq S\{x : s\}$, completing the proof of this case.

Suppose that $M$ is of the form $(\emptyset, EMPTY)$ and that $\vdash R \mid C, A \supset M : \gamma$. By Lemma 4.6.1, we can assume without loss of generality that the proof consists of a use of (rec1) followed by a use of (coerce). Thus, $C \vdash (\emptyset, NULL) \subseteq \gamma$. By Lemma 3.5.1, we can conclude that $(\emptyset, NULL)$ matches $\gamma$. Thus, $\gamma = (\emptyset, Z)$ for some $Z$.

It is easy to see that $GE((\emptyset, EMPTY))$ always succeeds. It remains to show that $R \mid C, A \supset (\emptyset, EMPTY) : \gamma$ is an instance of $GE((\emptyset, EMPTY))$ by some substitution $S$. Let

$$S = [(\emptyset, Z) / \mathcal{X}].$$

Since $\gamma = S(\emptyset, \mathcal{X})$, it follows that $S$ respects $\{(\emptyset, NULL) \subseteq (\emptyset, \mathcal{X})\}$. Thus, by Lemma 3.5.5, we have that $C \vdash S \bullet \{(\emptyset, NULL) \subseteq (\emptyset, \mathcal{X})\}$, completing the proof of this case.

Suppose that $M$ is of the form $(\emptyset, u)$ and that $\vdash R \mid C, A \supset M : \gamma$. By Lemma 4.6.1, we can assume without loss of generality that the proof consists of a use of (rec2) followed by a use of (coerce). By Lemma 3.6.3, $u$ must occur in $A$. Let $(h, Z)$ be the type expression with $u : (h, Z) \in A$, and note that $C \vdash (h, Z) \subseteq \gamma$. By Lemma 3.5.1, $\gamma$ and $(h, Z)$ must match. Thus, $\gamma = (h', Z')$ for some $h'$ and $Z'$ such that $dom(h) = dom(h')$ and $\forall l \in dom(h).\ h(l)$ matches $h'(l)$.

It is easy to see that $GE((\emptyset, u))$ always succeeds. It remains to show that $R \mid C, A \supset (\emptyset, u) : \gamma$ is an instance of $GE((\emptyset, u))$ by some substitution $S$. Let

$$S = [(h', Z') / \mathcal{Y},\ (h, Z) / \mathcal{X}].$$

Clearly, $S$ respects $\{(\emptyset, \mathcal{X}) \subseteq (\emptyset, \mathcal{Y})\}$; thus, by Lemma 3.5.5, we can conclude that $C \vdash S \bullet \{(\emptyset, \mathcal{X}) \subseteq (\emptyset, \mathcal{Y})\}$. Moreover, since $u : (h, Z) \in A$, it follows that $A \supseteq S\{u : (\emptyset, \mathcal{X})\}$, completing the proof of this case.

Suppose that $M$ is a built-in constant $q$ and that $\vdash R \mid C, A \supset q : \gamma$. By Lemma 4.6.1, we can assume without loss of generality that the proof consists of a use of (const) followed by a use of (coerce). Thus, we can assume that, for some substitution $V$ that is defined on $\tau_q$ (where $\tau_q$ is the built-in type for $q$), $C \vdash V\tau_q \subseteq \gamma$. By Lemma 3.5.1, $V\tau_q$ and $\gamma$ must match.

Let $Q' = Q \cup \{t\}$. Also, let $W$ be the substitution with $dom(W) = Q'$ and such that, for all type variables and row variables in $Q$, $W$ behaves identically to $V$; furthermore, $Wt = \gamma$. Clearly, $W$ is defined on $\tau_q$ and $W\tau_q = V\tau_q$; therefore, $W$ is a matching substitution for $\{\tau_q \subseteq t\}$. By Lemma 4.3.6, $MATCH(\{\tau_q \subseteq t\}, Q')$ succeeds, and $W =_{Q'} T;U$ for some substitution $U$ where $T;U$ is defined. Also by Lemma 4.3.6, $dom(T) \subseteq Q'$; thus, $T \upharpoonright Q' = T$. Therefore, since $dom(W) = Q'$,

$$W = T;U', \text{ where } U' = U \upharpoonright (range(T) \cup Q').$$

It remains to show that $R \mid C, A \supset q : \gamma$ is an instance of $GE(q)$ by some substitution $S$. Let $S = U'$. Then, by Lemma 4.6.3, $S$ respects $T \bullet \{\tau_q \subseteq t\}$, and $W \bullet \{\tau_q \subseteq t\} \vdash S \bullet (T \bullet \{\tau_q \subseteq t\})$. Since $W \bullet \{\tau_q \subseteq t\} = ATOMIC(V\tau_q \subseteq \gamma)$, we have by Lemma 3.5.5 that $C \vdash W \bullet \{\tau_q \subseteq t\}$, completing the proof of this case.

Suppose that $M$ is of the form $M'N'$ and that $\vdash R \mid C, A \supset M : \gamma$. By Lemma 4.6.1, we can assume without loss of generality that the proof ends in a use of (app). Thus, for some $\gamma'$,

$$\vdash R' \mid C, A \supset M' : \gamma' \to \gamma$$

and

$$\vdash R' \mid C, A \supset N' : \gamma',$$

where $R = R' \cup \{\gamma'\}$. We can assume inductively that $R' \mid C, A \supset M' : \gamma' \to \gamma$ is an instance of $GE(M') = R_1 \mid C_1, A_1 \supset M' : \sigma$ and that $R' \mid C, A \supset N' : \gamma'$ is an instance of $GE(N') = R_2 \mid C_2, A_2 \supset N' : \tau$. Thus, there exist substitutions $V_1$ and $V_2$ such that $V_1$ respects $C_1$ and is defined on $R_1$, $A_1$, and $\sigma$, $V_2$ respects $C_2$ and is defined on $R_2$, $A_2$, and $\tau$, and such that

$$R' \supseteq V_1 R_1, \quad C \vdash V_1 \bullet C_1, \quad A \supseteq V_1 A_1, \quad \gamma' \to \gamma = V_1 \sigma$$

and

$$R' \supseteq V_2 R_2, \quad C \vdash V_2 \bullet C_2, \quad A \supseteq V_2 A_2, \quad \gamma' = V_2 \tau.$$

We note that no type or row variable in $Q_1$ appears in $Q_2$, and that $t$ is not contained in $Q_1$ or $Q_2$. Thus, there exists a well-defined substitution $V$ whose domain is $\{t\} \cup Q_1 \cup Q_2$ and such that $V$ maps any type variable $s$ in its domain to $V_1 s$ if $s \in Q_1$ and to $V_2 s$ if $s \in Q_2$. (The analogous definition holds for row variables in the domain of $V$.) Also, $V$ maps $t$ to $\gamma$.

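It is easy to see that $V$ respects $C_1$ and $C_2$ and is defined on $R_1$, $R_2$, $A_1$, $A_2$, $\sigma$ and $\tau$, and that

$$R' \supseteq V R_1, \quad C \vdash V \bullet C_1, \quad A \supseteq V A_1, \quad \gamma' \to \gamma = V\sigma$$

and

$$R' \supseteq V R_2, \quad C \vdash V \bullet C_2, \quad A \supseteq V A_2, \quad \gamma' = V\tau.$$

Therefore, $V$ must unify $\{\alpha = \beta \mid w : \alpha \in A_1 \text{ and } w : \beta \in A_2\}$. Also, since $Vt = \gamma$, we have that $V\sigma = V\tau \to Vt = V(\tau \to t)$; thus, $V$ unifies $\sigma = \tau \to t$. Let $Q = Q_1 \cup Q_2 \cup \{t\}$. By Lemma 4.2.1, it follows that $T = UNIFY(\{\alpha = \beta \mid w : \alpha \in A_1 \text{ and } w : \beta \in A_2\} \cup \{\sigma = \tau \to t\}, Q)$ succeeds, and that $V =_Q T;U$ for some substitution $U$ such that $T;U$ is defined. Equivalently, $V =_Q T;U'$, where $U' = U \upharpoonright (range(T) \cup Q)$.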

Since only type and row variables from $Q$ occur in $C_1$ and $C_2$, clearly $U'$ respects $(TC_1 \cup TC_2)$. Therefore, by Lemma 4.3.6, we can assume that $MATCH(TC_1 \cup TC_2, range(T) \cup Q)$ succeeds. Furthermore, letting $T' = MATCH(TC_1 \cup TC_2, range(T) \cup Q)$, we have that $U' =_{range(T) \cup Q} T';W$ for some substitution $W$ where $T';W$ is defined. Since $dom(U') \subseteq (range(T) \cup Q)$, and since, by Lemma 4.3.6, $dom(T') \subseteq (range(T) \cup Q)$, we have that

$$U' = (T';W) \upharpoonright (range(T) \cup Q) = T'; W \upharpoonright (range(T') \cup range(T) \cup Q).$$

Thus, since $T;U'$ is defined, so is $T;(T'; W \upharpoonright (range(T') \cup range(T) \cup Q))$, and thus so is $S = T;T'$. Therefore, $V =_Q T;T';W =_Q S;W$.

We recall that all the type variables and row variables occurring in any of $C_1$, $C_2$, $R_1$, $R_2$, $A_1$, $A_2$, $\sigma$, $\tau$, and $t$ are contained in $Q$. Thus, $S$ respects $C_1$ and $C_2$ and is defined on $R_1$, $R_2$, $A_1$, $A_2$, $\sigma$ and $\tau$. Since $T$ unifies $A_1$ and $A_2$, so does $S$. Thus, $SA_1 \cup SA_2$ is a well-formed set. Furthermore, since $V =_Q S;W$, it is easy to see that $W$ is defined on $SR_1$, $SR_2$, $SA_1$, $SA_2$, $S\sigma$ and $S\tau$. By Lemma 4.6.3, we have that $W$ respects $S \bullet C_1$ and $S \bullet C_2$ and that $C \vdash W \bullet (S \bullet C_1 \cup S \bullet C_2)$.

Clearly, $W(St) = \gamma$, $A \supseteq W(SA_1 \cup SA_2)$, and $R' \supseteq W(SR_1 \cup SR_2)$. Since $V\tau = \gamma'$, we have that $R = R' \cup \{W(S\tau)\}$. Thus, $R \supseteq W(SR_1 \cup SR_2 \cup \{S\tau\})$. Therefore, $R \mid C, A \supset M'N' : \gamma$ is an instance of $GE(M'N')$ by substitution $W$.

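Suppose that $M$ is of the form fn $P \Rightarrow M'$ and $\vdash R \mid C, A \supset M : \gamma$. By Lemma 4.6.1, we can assume without loss of generality that the proof ends in a use of (abs). Thus, for some $A'$, $\gamma_1$, and $\gamma_2$, where $vars(A') = vars(P)$ and $\gamma = \gamma_1 \to \gamma_2$,

$$\vdash R \mid C^{op}, A' \supset P : \gamma_1$$

and

$$\vdash R \mid C, A[A'] \supset M' : \gamma_2.$$

We can thus assume inductively that $R \mid C^{op}, A' \supset P : \gamma_1$ is an instance of $GE(P) = R_p \mid C_p, A_p \supset P : \sigma$ and that $R \mid C, A[A'] \supset M' : \gamma_2$ is an instance of $GE(M') = R_e \mid C_e, A_e \supset M' : \tau$. Thus, there exist substitutions $V_p$ and $V_e$ such that $V_p$ respects $C_p$ and is defined on $R_p$, $A_p$, and $\sigma$, $V_e$ respects $C_e$ and is defined on $R_e$, $A_e$, and $\tau$, and such that

$$R \supseteq V_p R_p, \quad C^{op} \vdash V_p \bullet C_p, \quad A' \supseteq V_p A_p, \quad \gamma_1 = V_p \sigma$$

and

$$R \supseteq V_e R_e, \quad C \vdash V_e \bullet C_e, \quad A[A'] \supseteq V_e A_e, \quad \gamma_2 = V_e \tau.$$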
Since no type or row variables in $Q_p$ appear in $Q_e$, we can define the substitution $V$ whose domain is $Q_p \cup Q_e$ and such that $V$ maps any type variable $t$ in its domain to $V_p t$ if $t \in Q_p$ and to $V_e t$ if $t \in Q_e$. The action of $V$ on row variables in its domain is analogous. It is easy to see that $V$ respects $C_p$ and $C_e$, is defined on $R_p$, $R_e$, $A_p$, $A_e$, $\sigma$ and $\tau$, and that

$$R \supseteq V R_p, \quad C^{op} \vdash V \bullet C_p, \quad A' \supseteq V A_p, \quad \gamma_1 = V\sigma$$

and

$$R \supseteq V R_e, \quad C \vdash V \bullet C_e, \quad A[A'] \supseteq V A_e, \quad \gamma_2 = V\tau.$$

By Lemma 4.5.1, $vars(A_p) = vars(P)$. Thus, $A' = VA_p$, and thus, $A[VA_p] \supseteq VA_e$. This implies that if $w : V\alpha \in VA_p$ and $w : V\beta \in VA_e$, then $V\alpha = V\beta$. Let $Q = Q_p \cup Q_e$. By Lemma 4.2.1, it follows that $T = UNIFY(\{\alpha = \beta \mid w : \alpha \in A_p \text{ and } w : \beta \in A_e\}, Q)$ succeeds, and that $V =_Q T;U$ for some $U$ such that $T;U$ is defined. Equivalently, $V =_Q T;U'$, where $U' = U \upharpoonright (range(T) \cup Q)$.

Since only type and row variables from $Q$ occur in $C_p$ and $C_e$, clearly $U$ respects $(TC_p \cup TC_e)$. Thus, by Lemma 4.3.6, we can assume that $MATCH(TC_p \cup TC_e, range(T) \cup Q)$ succeeds. Furthermore, letting $T' = MATCH(TC_p \cup TC_e, range(T) \cup Q)$, we have that $U' =_{range(T) \cup Q} T';W$ for some substitution $W$ where $T';W$ is defined. Since $dom(U') \subseteq (range(T) \cup Q)$, and since, by Lemma 4.3.6, $dom(T') \subseteq (range(T) \cup Q)$, we have that

$$U' = (T';W) \upharpoonright (range(T) \cup Q) = T'; W \upharpoonright (range(T') \cup range(T) \cup Q).$$

Thus, since $T;U'$ is defined, so is $T;(T'; W \upharpoonright (range(T') \cup range(T) \cup Q))$, and thus so is $S = T;T'$. Therefore, $V =_Q T;T';W =_Q S;W$.

We recall that all the type variables and row variables occurring in any of $C_p$, $C_e$, $R_p$, $R_e$, $A_p$, $A_e$, $\sigma$, and $\tau$ are contained in $Q$. Clearly, $S$ respects $C_p$ and $C_e$ and is defined on $R_p$, $R_e$, $A_p$, $A_e$, $\sigma$ and $\tau$. Since $T$ unifies $A_p$ and $A_e$, so does $S$. Thus, $SA_e - SA_p$ is a well-defined set. Furthermore, since $V =_Q S;W$, it is easy to see that $W$ is defined on $SR_p$, $SR_e$, $SA_p$, $SA_e$, $S\sigma$ and $S\tau$. By Lemma 4.6.3, $W$ respects $S \bullet C_p$ and $S \bullet C_e$ and $C \vdash W \bullet ((S \bullet C_p)^{op} \cup S \bullet C_e)$. It is easy to see that

$$W(S\sigma \to S\tau) = \gamma_1 \to \gamma_2, \quad \text{and} \quad R \supseteq W(SR_p \cup SR_e).$$

Since $A' = W(SA_p)$, we know that $A[A'] = A_1 \cup W(SA_p)$, where $A = A_1 \cup \{w : \alpha \mid w : \alpha \in A \text{ and } w : \beta \in W(SA_p)\}$. (Note that these are “disjoint” unions, since these sets are well-formed.) If $w : \alpha \in W(SA_e - SA_p)$, then $w \notin vars(W(SA_p))$. Thus, since $A[A'] \supseteq W(SA_e) \supseteq W(SA_e - SA_p)$, $w : \alpha \in A_1$. Thus, $w : \alpha \in A$, and therefore $A \supseteq W(SA_e - SA_p)$, as desired.

Thus, $R \mid C, A \supset \text{fn } P \Rightarrow M' : \gamma$ is an instance of $GE(\text{fn } P \Rightarrow M')$ by substitution $W$.

Suppose that $M$ is of the form $(f, E)$, where $f$ is non-empty, and $\vdash R \mid C, A \supset M : \gamma$. By Lemma 4.6.1, we can assume without loss of generality that the proof ends in a use of (rec3). Therefore, we can assume that for some $h_1$, $h_2$, and $Z'$ where $h_1 + h_2$ is well-defined, $dom(f) = dom(h_1)$, and $\gamma = (h_1 + h_2, Z')$,

$$\vdash R \mid C, A \supset (\emptyset, E) : (h_2, Z')$$

and

$$\forall l \in dom(f).\ \vdash R \mid C, A \supset f(l) : h_1(l).$$

We can thus assume inductively that $R \mid C, A \supset (\emptyset, E) : (h_2, Z')$ is an instance of $GE((\emptyset, E)) = R_E \mid C_E, A_E \supset (\emptyset, E) : (h_E, Z)$ and that for all $l \in dom(f)$, $R \mid C, A \supset f(l) : h_1(l)$ is an instance of $GE(f(l)) = R_l \mid C_l, A_l \supset f(l) : h(l)$.

Thus, there exists a substitution $V_E$ such that $V_E$ respects $C_E$ and is defined on $R_E$, $A_E$, and $(h_E, Z)$ and such that

$$R \supseteq V_E R_E, \quad C \vdash V_E \bullet C_E, \quad A \supseteq V_E A_E, \quad (h_2, Z') = V_E(h_E, Z).$$

Also, for all $l \in dom(f)$, there exists a substitution $V_l$ such that $V_l$ respects $C_l$ and is defined on $R_l$, $A_l$, and $h(l)$ and such that

$$R \supseteq V_l R_l, \quad C \vdash V_l \bullet C_l, \quad A \supseteq V_l A_l, \quad h_1(l) = V_l(h(l)).$$

Since the type and row variables that occur in $Q_E$ and in the $Q_l$ are all distinct, there is a well-defined substitution $V$ whose domain is $Q_E \cup \bigcup_{l \in dom(f)} Q_l$, and such that for all type variables $t$ in its domain, $Vt = V_E t$ if $t \in Q_E$, and, for any $l \in dom(f)$, $Vt = V_l t$ if $t \in Q_l$. An analogous definition holds for the action of $V$ on row variables in its domain.

It is easy to see that $V$ respects $C_E$ and is defined on $R_E$, $A_E$ and $(h_E, Z)$ and that, for all $l \in dom(f)$, $V$ respects $C_l$ and is defined on $R_l$, $A_l$, and $h(l)$. Moreover, we have that

$$R \supseteq V R_E, \quad C \vdash V \bullet C_E, \quad A \supseteq V A_E, \quad (h_2, Z') = V(h_E, Z),$$

and for all $l \in dom(f)$,

$$R \supseteq V R_l, \quad C \vdash V \bullet C_l, \quad A \supseteq V A_l, \quad h_1(l) = V(h(l)).$$

Let $Q = Q_E \cup \bigcup_{l \in dom(f)} Q_l$. Since $V$ must unify $A_E$ and all of the $A_l$, we can conclude by Lemma 4.2.1 that $T = UNIFY(\{\alpha = \beta \mid w : \alpha \in A_l \text{ and } w : \beta \in A_{l'}, \text{ for } l \neq l',\ l, l' \in dom(f), \text{ or } w : \alpha \in A_l \text{ and } w : \beta \in A_E, \text{ for } l \in dom(f)\}, Q)$ succeeds. Furthermore, $V =_Q T;U$ for some $U$ such that $T;U$ is defined. Equivalently, $V =_Q T;U'$, where $U' = U \upharpoonright (range(T) \cup Q)$.

Since only type and row variables from $Q$ occur in $C_E$ and all the $C_l$, clearly $U$ must respect $TC_E$ and all of the $TC_l$. Thus, by Lemma 4.3.6, we can assume that $MATCH(TC_E \cup \bigcup_{l \in dom(f)} TC_l, range(T) \cup Q)$ succeeds. Furthermore, letting

$$T' = MATCH(TC_E \cup \bigcup_{l \in dom(f)} TC_l,\ range(T) \cup Q),$$

we have that $U' =_{range(T) \cup Q} T';W$ for some substitution $W$ where $T';W$ is defined. Since $dom(U') \subseteq (range(T) \cup Q)$, and since, by Lemma 4.3.6, $dom(T') \subseteq (range(T) \cup Q)$, we have that

$$U' = (T';W) \upharpoonright (range(T) \cup Q) = T'; W \upharpoonright (range(T') \cup range(T) \cup Q).$$

Thus, since $T;U'$ is defined, so is $T;(T'; W \upharpoonright (range(T') \cup range(T) \cup Q))$, and thus so is $S = T;T'$. Therefore, $V =_Q T;T';W =_Q S;W$.

We recall that all the type variables and row variables occurring in $C_E$, $A_E$, $R_E$, $(h_E, Z)$, and any of the $C_l$, $R_l$, $A_l$, $\sigma_l$, are contained in $Q$. Thus, $S$ respects $C_E$ and all the $C_l$, and is defined on $R_E$ and all the $R_l$, on $A_E$ and all the $A_l$, and on $(h_E, Z)$ and all the $h(l)$. Since $T$ unifies $A_E$ and all the $A_l$, so does $S$; thus, $SA_E \cup \bigcup_{l \in dom(f)} SA_l$ is well-formed.
It is easy to see that $W$ is defined on $SR_E$ and all the $SR_l$, on $SA_E$ and all the $SA_l$, and on $S(h_E, Z)$ and all the $S(h(l))$. By Lemma 4.6.3, $W$ respects $S \bullet C_E$ and all the $S \bullet C_l$ and

$$C \vdash W \bullet (S \bullet C_E \cup \bigcup_{l \in dom(f)} S \bullet C_l).$$

Clearly,

$$R \supseteq W(SR_E \cup \bigcup_{l \in dom(f)} SR_l), \quad A \supseteq W(SA_E \cup \bigcup_{l \in dom(f)} SA_l).$$

All that remains to be shown is that $h + h_E$ is well-defined, $S$ is defined on $(h + h_E, Z)$, $W$ is defined on $S(h + h_E, Z)$, and that $(h_1 + h_2, Z') = W(S(h + h_E, Z))$.

By Lemma 4.6.2, $h_E = \emptyset$. Thus, $(h + h_E, Z) = (h, Z)$, and is well-formed. Then, $V(h + h_E, Z) = (h_1 + left(V(\emptyset, Z)), right(V(\emptyset, Z))) = (h_1 + h_2, Z')$. Since, by assumption, $h_1 + h_2$ is well-defined, so is $V(h + h_E, Z)$. Then, so is $S(h + h_E, Z)$ and $W(S(h + h_E, Z))$. Furthermore, $W(S(h + h_E, Z)) = (h_1 + h_2, Z')$.

Thus, $R \mid C, A \supset (f, E) : \gamma$ is an instance of $GE((f, E))$ by substitution $W$, proving the theorem. $\blacksquare$


4.7 Relating GE to the Original Typing System 


In this chapter, we have shown that the type inference algorithm GE is sound and complete 
with respect to the R-typing system. However, the reader may wonder how GE relates to 
the original typing system defined in Chapter 3. 

If we say that an unrestricted typing $C, A \supset M : \sigma$ is an instance of $R' \mid C', A' \supset M : \sigma'$ by substitution $S$ whenever $S$ is defined on $R'$ and $C, A \supset M : \sigma$ is an instance of $C', A' \supset M : \sigma'$ by $S$, we have the following corollaries for soundness and completeness.

Corollary 4.7.1 Suppose that $GE(M)$ succeeds and produces a restricted typing statement $R \mid C, A \supset M : \sigma$. If $C', A' \supset M : \sigma'$ is an instance of $R \mid C, A \supset M : \sigma$, then $\vdash C', A' \supset M : \sigma'$.

The proof follows easily by Theorem 4.5.2 and Lemma 3.6.7.

Corollary 4.7.2 If $\vdash C, A \supset M : \sigma$, then $GE(M)$ succeeds and produces a restricted typing statement with $C, A \supset M : \sigma$ as an instance.

The corollary is proved by noting that if $C, A \supset M : \sigma$ is provable, then there is a proof using some set of cut formulas.

Therefore, given an expression $M$, GE computes a most general restricted typing statement for $M$ such that $C, A \supset M : \sigma$ is provable in the original typing system iff it is an instance of this most general restricted typing statement.
Appendix A


Unification: Algorithm and Proof 
of Correctness 


This appendix is devoted to proving Lemma 4.2, which states that an algorithm UNIFY exists that, given a set $E$ of equations between types and a co-infinite set $Q$ of type variables and row variables, computes a most general unifier for $E$ with respect to $Q$. In this appendix, we present the algorithm UNIFY, which uses an algorithm UNIFIER that we also present here. In order to prove that UNIFY is correct, we use some properties about most general unifiers for equations between certain types. We prove these properties in Lemmas A.1, A.2, and A.3, and then use these lemmas to prove Lemma 4.2.

Our algorithm UNIFIER is quite similar to the unification algorithm of [Wan87]; how- 
ever, it fixes a bug in Wand’s algorithm for the case of unifying two extended records. As 
we will discuss in the proof of Lemma 4.2, Wand’s termination proof is incorrect, since the 
halting measure does not necessarily decrease after every iteration of the algorithm. We fix 
this bug by using some of the properties of unification developed in Lemma A.3 in defining 
our algorithm UNIFIER, and we prove that UNIFIER does terminate. 

The algorithm UNIFIER is defined as follows: 


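UNIFIER($\emptyset$, $Q$) = $\emptyset$

UNIFIER($E' \cup \{c_1 = c_2\}$, $Q$) =
    if $c_1 \neq c_2$ then fail
    else UNIFIER($E'$, $Q$)

UNIFIER($E' \cup \{\sigma_1 \to \sigma_2 = \tau_1 \to \tau_2\}$, $Q$) =
    UNIFIER($E' \cup \{\sigma_1 = \tau_1, \sigma_2 = \tau_2\}$, $Q$)

UNIFIER($E' \cup \{t = \tau\}$, $Q$) =
    if $t = \tau$ then UNIFIER($E'$, $Q \cup \{t\}$)
    else if $t$ occurs in $\tau$ then fail
    else let $Q' = Q \cup \{t\} \cup$ {the type and row variables in $\tau$}
         in $[\tau/t]$; UNIFIER($[\tau/t]E'$, $Q'$)

UNIFIER($E' \cup \{(h, NULL) = (h', NULL)\}$, $Q$) =
    if $dom(h) \neq dom(h')$ then fail
    else UNIFIER($E' \cup \bigcup_{l \in dom(h)} \{h(l) = h'(l)\}$, $Q$)

UNIFIER($E' \cup \{(h, \mathcal{X}) = (h', \mathcal{X})\}$, $Q$) =
    if $dom(h) \neq dom(h')$ then fail
    else UNIFIER($E' \cup \bigcup_{l \in dom(h)} \{h(l) = h'(l)\}$, $Q \cup \{\mathcal{X}\}$)

UNIFIER($E' \cup \{(h, \mathcal{X}) = (h', NULL)\}$, $Q$) =
    if $dom(h) \not\subseteq dom(h')$ then fail
    else let $h_1 = h' \upharpoonright (dom(h') - dom(h))$
         in if $\mathcal{X}$ occurs in $(h_1, NULL)$ then fail
            else let $Q' = Q \cup \{\mathcal{X}\} \cup$ {the type and row variables in the range of $h_1$}
                 in $[(h_1, NULL)/\mathcal{X}]$; UNIFIER($[(h_1, NULL)/\mathcal{X}](E' \cup \bigcup_{l \in dom(h)} \{h(l) = h'(l)\})$, $Q'$)

UNIFIER($E' \cup \{(h, \mathcal{X}) = (h', \mathcal{X}')\}$, $Q$) =   (where $\mathcal{X} \neq \mathcal{X}'$)
    let $h_1 = h' \upharpoonright (dom(h') - dom(h))$
        $h_2 = h \upharpoonright (dom(h) - dom(h'))$
        $Q_E$ = the type and row variables in $E' \cup \{(h, \mathcal{X}) = (h', \mathcal{X}')\}$
        $\mathcal{Y}$ be a fresh row variable with respect to $Q \cup Q_E$
    in if $\mathcal{X}$ occurs in $(h_1, \mathcal{Y})$ or $\mathcal{X}'$ occurs in $(h_2, \mathcal{Y})$
       OR
       if $\mathcal{X}$ occurs in $(h_2, \mathcal{Y})$ and $\mathcal{X}'$ occurs in $(h_1, \mathcal{Y})$
       then fail
       else let $V = [(h_1, \mathcal{Y})/\mathcal{X}, (h_2, \mathcal{Y})/\mathcal{X}']; [(h_1, \mathcal{Y})/\mathcal{X}, (h_2, \mathcal{Y})/\mathcal{X}']$
                $Q' = Q \cup \{\mathcal{X}, \mathcal{X}'\} \cup$ {the type and row variables in the range of $h_1$ or $h_2$}
            in $V$; UNIFIER($V(E' \cup \bigcup_{l \in dom(h) \cap dom(h')} \{h(l) = h'(l)\})$, $Q'$)

UNIFIER($E' \cup \{\sigma = \tau\}$, $Q$) =
    if $\sigma = \tau$ is an equation between any other types
    then fail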

Using this algorithm UNIFIER, we define our algorithm UNIFY, which simply checks 
that the substitution returned by UNIFIER is defined on all the types in the original set of 
equations. We define UNIFY as follows: 


UNIFY($E$, $Q$) =
    let $S$ = UNIFIER($E$, $Q$)
    in if, for all $\sigma = \tau \in E$, $S$ is defined on $\sigma$ and $\tau$
       then return $S$
       else fail
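
To make the first four clauses of UNIFIER concrete, here is a minimal Python sketch of the record-free fragment only (constants, arrows, and type variables). It is an illustration under assumptions of ours, not the algorithm as defined above: type variables are strings, arrows are tuples `("->", dom, cod)`, ground types are tuples `("base", name)`, and substitutions are total dictionaries. The bookkeeping argument $Q$ is dropped, since it is only needed to pick fresh row variables in the record clauses, and the definedness check performed by UNIFY is trivial in this fragment.

```python
class UnifyError(Exception):
    """Raised where UNIFIER would fail."""


def apply(S, ty):
    if isinstance(ty, str):
        return S.get(ty, ty)
    if ty[0] == "->":
        return ("->", apply(S, ty[1]), apply(S, ty[2]))
    return ty


def compose(T, S):
    """T;S, i.e. apply T first and then S."""
    out = {v: apply(S, t) for v, t in T.items()}
    for v, t in S.items():
        out.setdefault(v, t)
    return out


def occurs(v, ty):
    if isinstance(ty, str):
        return v == ty
    if ty[0] == "->":
        return occurs(v, ty[1]) or occurs(v, ty[2])
    return False


def unifier(eqs):
    """Unify a list of (lhs, rhs) equations in the record-free fragment."""
    eqs = list(eqs)
    if not eqs:
        return {}                                   # UNIFIER(empty) = empty
    (l, r), rest = eqs[0], eqs[1:]
    if isinstance(r, str) and not isinstance(l, str):
        l, r = r, l                                 # orient as t = tau
    if isinstance(l, str):
        if l == r:
            return unifier(rest)
        if occurs(l, r):
            raise UnifyError(f"{l} occurs in {r}")
        step = {l: r}                               # the substitution [tau/t]
        rest = [(apply(step, a), apply(step, b)) for a, b in rest]
        return compose(step, unifier(rest))
    if l[0] == "base" and r[0] == "base":
        if l != r:
            raise UnifyError(f"{l} differs from {r}")
        return unifier(rest)
    if l[0] == "->" and r[0] == "->":
        return unifier([(l[1], r[1]), (l[2], r[2])] + rest)
    raise UnifyError("equation between other types")


lhs = ("->", "a", ("base", "int"))
rhs = ("->", ("base", "bool"), "b")
S = unifier([(lhs, rhs)])
print(apply(S, lhs) == apply(S, rhs))
```

The record clauses, which are where the algorithm departs from [Wan87], are not covered by this sketch.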


Before proceeding to prove the correctness of UNIFY, we first develop some properties 
of unification that we will need in that proof. The first such property concerns unifying a 
type variable with a type. 


Lemma A.1 A substitution $T$ unifies $E' \cup \{t = \tau\}$ iff $t = \tau$ and $T$ unifies $E'$ OR

1. $t \neq \tau$ and

2. $t$ does not occur in $\tau$ and

3. there exists a substitution $T'$ such that $[\tau/t];T'$ is defined, and $T = [\tau/t];T'$, and $T'$ unifies $[\tau/t]E'$.

We omit the proof, as it is quite similar to [Rob65].


We also need the following property about unifying an extended record with a fixed record.

Lemma A.2 A substitution $T$ unifies $E' \cup \{(h, \mathcal{X}) = (h', NULL)\}$ iff

1. $dom(h) \subseteq dom(h')$ and

2. $\mathcal{X}$ does not occur in $(h_1, NULL)$, where $h_1 = h' \upharpoonright (dom(h') - dom(h))$ and

3. $[(h_1, NULL)/\mathcal{X}](E' \cup \bigcup_{l \in dom(h)} \{h(l) = h'(l)\})$ is defined and

4. there exists a substitution $T'$ such that $[(h_1, NULL)/\mathcal{X}];T'$ is defined and $T = [(h_1, NULL)/\mathcal{X}];T'$ and $T'$ unifies $[(h_1, NULL)/\mathcal{X}](E' \cup \bigcup_{l \in dom(h)} \{h(l) = h'(l)\})$.

Again, we omit the proof, as it uses the same sort of reasoning as the proof of Lemma A.1.
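
The first two conditions of Lemma A.2, together with the binding for the row variable and the field equations that remain to be solved, can be sketched as follows. This is a hypothetical illustration only: the encoding of record types as `("rec", fields, row)` with `fields` a dictionary and `row` either the string `"NULL"` or a row-variable name, and the function name `extended_vs_fixed`, are assumptions of ours. The sketch returns the binding and the new equations separately and does not apply the substitution to the remaining equations or check definedness.

```python
class UnifyError(Exception):
    """Raised where the corresponding UNIFIER clause would fail."""


def row_occurs(x, ty):
    """Does row variable x occur in the type ty?"""
    if isinstance(ty, str):                 # a type variable, not a row variable
        return False
    if ty[0] == "->":
        return row_occurs(x, ty[1]) or row_occurs(x, ty[2])
    if ty[0] == "rec":
        _, fields, row = ty
        return row == x or any(row_occurs(x, t) for t in fields.values())
    return False                            # a ground type


def extended_vs_fixed(h, x, h_fixed):
    """Handle the equation (h, x) = (h_fixed, NULL).

    Returns the type bound to the row variable x and the list of
    field equations h(l) = h_fixed(l) for l in dom(h)."""
    if not set(h) <= set(h_fixed):          # condition 1: dom(h) within dom(h')
        raise UnifyError("extended record has a label the fixed record lacks")
    h1 = {l: t for l, t in h_fixed.items() if l not in h}
    if any(row_occurs(x, t) for t in h1.values()):   # condition 2
        raise UnifyError("row variable occurs in its own binding")
    binding = ("rec", h1, "NULL")           # the substitution [(h1, NULL)/x]
    new_eqs = [(h[l], h_fixed[l]) for l in sorted(h)]
    return binding, new_eqs


binding, eqs = extended_vs_fixed(
    {"x": "a"}, "X", {"x": ("base", "int"), "y": ("base", "bool")})
print(binding, eqs)
```

In this example the row variable absorbs the field `y`, and the single remaining equation relates the type variable `a` to the ground type of field `x`.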


The final property that we will need concerns unifying two extended records. 


Lemma A.3 Suppose that $E = E' \cup \{(h, \mathcal{X}) = (h', \mathcal{X}')\}$. Let $Q$ be a co-infinite set of type variables and row variables, and let $Q_E$ be the set of type variables and row variables occurring in $E$. Also, let $V = [(h_1, \mathcal{Y})/\mathcal{X}, (h_2, \mathcal{Y})/\mathcal{X}']; [(h_1, \mathcal{Y})/\mathcal{X}, (h_2, \mathcal{Y})/\mathcal{X}']$, for some row variable $\mathcal{Y}$ that is fresh with respect to $Q \cup Q_E$. A substitution $T$ unifies $E$ iff

1. $\mathcal{X}$ does not occur in $(h_1, \mathcal{Y})$, where $h_1 = h' \upharpoonright (dom(h') - dom(h))$ and

2. $\mathcal{X}'$ does not occur in $(h_2, \mathcal{Y})$, where $h_2 = h \upharpoonright (dom(h) - dom(h'))$ and

3. either $\mathcal{X}$ does not occur in $(h_2, \mathcal{Y})$ or $\mathcal{X}'$ does not occur in $(h_1, \mathcal{Y})$ and

4. $V$ is defined and

5. $V(E' \cup \bigcup_{l \in dom(h) \cap dom(h')} \{h(l) = h'(l)\})$ is defined and

6. there exists a substitution $T'$ such that $V;T'$ is defined and $T =_{Q \cup Q_E} V;T'$ and $T'$ unifies $V(E' \cup \bigcup_{l \in dom(h) \cap dom(h')} \{h(l) = h'(l)\})$ and

7. $T$ is defined on $(h, \mathcal{X})$ and $(h', \mathcal{X}')$.


Again, we omit the proof here. 


Using the lemmas developed above, we now give the proof of Lemma 4.2. 


Lemma 4.2 Let E be a set of equations of the form σ = τ and let Q be a co-infinite set
of type variables and row variables. Let Q_E be the set of type variables and row variables
that occur in E. There exists an algorithm UNIFY such that whenever there is a unifying
substitution for E, UNIFY(E, Q) produces a most general unifying substitution with respect
to Q ∪ Q_E. Otherwise, UNIFY(E, Q) fails.


Proof. In order to prove the lemma, it is sufficient to show the following three
properties of UNIFIER:

1. UNIFIER(E, Q) halts.

2. If there is a substitution T that unifies E, then UNIFIER(E, Q) succeeds and produces
a substitution S such that S unifies E. Furthermore, there exists a substitution U
such that S;U is defined and T =_{Q ∪ Q_E} S;U.

3. If there is no unifying substitution for E, then UNIFIER(E, Q) either fails or produces
a substitution S such that for some σ = τ ∈ E, S is not defined on at least one of σ
and τ.


In order to prove (1), we define the degree of a set E to be the pair (m, n), where m is the
number of distinct type variables and row variables that occur in E, and n is the number
of occurrences of → and ( ) plus the total number of (possibly non-distinct) type variables, row
variables, ground types, and NULL that occur in E. We say that (m, n) is smaller than
(m', n') if either m < m', or m = m' and n < n'. We will show that each clause of the
algorithm reduces the degree of the set E that is passed around. We note here that a set
E has degree (0, 0) iff E is empty.
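
The degree can be made concrete with a small Python sketch. The type encoding is an assumption chosen for illustration and is not taken from the text: a type variable is a string, a ground type is ("const", name), an arrow type is ("->", domain, codomain), and a record is ("rec", fields, tail), where fields is a dict and tail is a row-variable name or None for NULL. Type and row variables are assumed to have distinct names. The ordering on degrees is the lexicographic one, which is also how Python compares tuples.

```python
def degree(equations):
    """Return (m, n) for a list of (lhs, rhs) type pairs.

    m counts the distinct type and row variables; n counts every
    occurrence of an arrow, a record bracket, a variable, a ground
    type, or NULL.
    """
    variables = set()
    count = 0

    def walk(ty):
        nonlocal count
        count += 1                        # the node itself is one occurrence
        if isinstance(ty, str):           # type variable
            variables.add(ty)
        elif ty[0] == "->":
            walk(ty[1])
            walk(ty[2])
        elif ty[0] == "rec":
            _, fields, tail = ty
            for field_ty in fields.values():
                walk(field_ty)
            count += 1                    # the tail: a row variable or NULL
            if tail is not None:
                variables.add(tail)
        # a ground type has nothing below it

    for lhs, rhs in equations:
        walk(lhs)
        walk(rhs)
    return (len(variables), count)
```

Under this encoding, degree([]) is (0, 0), and an equation between two ground types contributes exactly 2 to the second component, matching the (m, n − 2) step used for the case c1 = c2.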

The proof proceeds by induction on the structure of σ = τ, where E = E' ∪ {σ = τ},
and (m, n) is the degree of E.

Suppose that σ = τ is of the form c_1 = c_2. To prove (1), we note that, clearly, the
degree of E' is (m, n − 2). To prove (2) and (3), we note that the degree of E' is smaller
than the degree of E, and that Q_E = Q_{E'}. The proof of (2) and (3) then follows easily from
the inductive hypothesis.

Suppose that σ = τ is of the form σ_1 → σ_2 = τ_1 → τ_2. To prove (1), we observe that,
clearly, the degree of E' is (m, n − 2). To prove (2) and (3), we first note that a substitution
T unifies E iff T unifies E' ∪ {σ_1 = τ_1, σ_2 = τ_2}. We also note that the degree of
E' ∪ {σ_1 = τ_1, σ_2 = τ_2} is smaller than the degree of E, and that both of these sets contain the same
type variables and row variables. The proof of (2) and (3) again follows easily from the
inductive hypothesis.

Suppose that σ = τ is of the form (h, NULL) = (h', NULL). To prove (1), we observe
that, clearly, the degree of E' is (m, n − 4). To prove (2) and (3), we first note that a
substitution T unifies E iff dom(h) = dom(h') and T unifies (E' ∪ ⋃_{l ∈ dom(h)} {h(l) = h'(l)}).
We also note that the degree of (E' ∪ ⋃_{l ∈ dom(h)} {h(l) = h'(l)}) is smaller than the degree
of E, and that both of these sets contain the same type variables and row variables. The
proof of (2) and (3) again follows easily from the inductive hypothesis.

Suppose that σ = τ is of the form (h, X) = (h', X). To prove (1), we note that, clearly,
the degree of E' is (m, n − 4). To prove (2) and (3), we first note that a substitution T
unifies E iff dom(h) = dom(h') and T unifies (E' ∪ ⋃_{l ∈ dom(h)} {h(l) = h'(l)}) and T is defined
on (h, X) and (h', X). We also note that the degree of (E' ∪ ⋃_{l ∈ dom(h)} {h(l) = h'(l)}) is
smaller than the degree of E, and that Q_E = Q_{(E' ∪ ⋃_{l ∈ dom(h)} {h(l) = h'(l)})} ∪ {X}. The proof of
(2) and (3) then follows easily from the inductive hypothesis.

Suppose that σ = τ is of the form t = τ. To prove (1), we first note that if UNIFIER
succeeds then either t = τ or t does not occur in τ. In the first case, the degree of E' is
clearly (m, n − 2); in the second case, the first component of the degree of [τ/t]E' is at most
m − 1. To prove (2) and (3), we use Lemma A.1 and again proceed by case analysis. In
the case that t = τ, we note that the degree of E' is less than the degree of E, and that
Q_E = Q_{E'} ∪ {t}. In the case that t ≠ τ, we note that the degree of [τ/t]E' is less than
the degree of E, and that Q_E = Q_{[τ/t]E'} ∪ {t} ∪ {the type and row variables in τ}. Using
these properties and Lemma A.1, the proofs of (2) and (3) for both cases follow from the
inductive hypothesis.

Suppose that σ = τ is of the form (h, X) = (h', NULL). To prove (1), we observe that
if UNIFIER succeeds, then the first component of the degree of [(h_1, NULL)/X]E' is at
most m − 1. To prove (2) and (3), we use Lemma A.2. We first note that the degree of
[(h_1, NULL)/X]E' is less than the degree of E, and that Q_E = Q_{[(h_1, NULL)/X]E'} ∪ {X} ∪
{the type and row variables in the range of h_1}. To prove (2), we also note that, if there
is a substitution that unifies E, then, since (h_1, NULL) contains only type and row variables
from Q_E, Lemma A.2 implies that [(h_1, NULL)/X]; UNIFIER([(h_1, NULL)/X](E' ∪
⋃_{l ∈ dom(h)} {h(l) = h'(l)}), Q') is defined. Using these properties and Lemma A.2, the proofs
of (2) and (3) follow from the inductive hypothesis.

Suppose that σ = τ is of the form (h, X) = (h', X'). To prove (1), we observe that if
UNIFIER succeeds, then the first component of the degree of V(E') is at most m − 1. To prove (2)
and (3), we use Lemma A.3. We first note that the degree of V(E') is less than the degree of
E, and that Q_E = Q_{V(E')} ∪ {the type and row variables in the range of h_1 or h_2} ∪
{X, X'}. To prove (2), we also note that, if there is a substitution T that unifies E, then,
since (h_1, Y) and (h_2, Y) contain only type and row variables from Q_E, Lemma A.3 implies
that V; UNIFIER(V(E' ∪ ⋃_{l ∈ dom(h) ∩ dom(h')} {h(l) = h'(l)}), Q') is defined. By
the inductive hypothesis, we can deduce that this substitution is equal to T with respect to
Q ∪ Q_E, and thus this substitution is defined on (h, X) and (h', X'). Using these properties
and Lemma A.3, the proofs of (2) and (3) follow from the inductive hypothesis, proving the
lemma.

As mentioned at the beginning of the appendix, our algorithm UNIFIER corrects a bug 
in the unification algorithm of [Wan87]. Wand’s algorithm uses the substitution 


    [(h_1, Y)/X, (h_2, Y)/X']

instead of the substitution

    [(h_1, Y)/X, (h_2, Y)/X']; [(h_1, Y)/X, (h_2, Y)/X']

which we call V. However, applying the simpler substitution used by Wand to the set
(E' ∪ ⋃_{l ∈ dom(h) ∩ dom(h')} {h(l) = h'(l)}) does not necessarily decrease the halting measure,
since X' may occur in h_1 or X may occur in h_2. We correct this difficulty in Wand's
algorithm through our use of the substitution V.
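
A small Python sketch may help show why the second application of the substitution matters. Everything in it is an illustrative assumption rather than the algorithm from the text: records are encoded as ("rec", fields, tail) with fields a dict and tail a row-variable name or None for NULL, ground types as ("const", name), arrows as ("->", domain, codomain), and type variables as strings. The helper names apply_rows and row_vars are hypothetical, and label clashes when a row variable is expanded are assumed not to arise.

```python
def apply_rows(sub, ty):
    """Apply a simultaneous row-variable substitution to a type.

    sub maps a row-variable name to a pair (extra_fields, new_tail).
    The replacement is inserted as is, without substituting inside it.
    """
    if isinstance(ty, str) or ty[0] == "const":
        return ty
    if ty[0] == "->":
        return ("->", apply_rows(sub, ty[1]), apply_rows(sub, ty[2]))
    _, fields, tail = ty
    new_fields = {label: apply_rows(sub, f) for label, f in fields.items()}
    if tail in sub:
        extra, tail = sub[tail]
        new_fields.update(extra)          # assumes no label clashes
    return ("rec", new_fields, tail)


def row_vars(ty):
    """Collect the row variables that occur in a type."""
    if isinstance(ty, str) or ty[0] == "const":
        return set()
    if ty[0] == "->":
        return row_vars(ty[1]) | row_vars(ty[2])
    _, fields, tail = ty
    found = set() if tail is None else {tail}
    for f in fields.values():
        found |= row_vars(f)
    return found


# Example equation: ({a: int}, X) = ({b: ({}, X')}, X'), so X' occurs in h1.
h1 = {"b": ("rec", {}, "X'")}             # fields of h' that h lacks
h2 = {"a": ("const", "int")}              # fields of h that h' lacks
wand = {"X": (h1, "Y"), "X'": (h2, "Y")}  # the simpler substitution

remaining = ("rec", {}, "X")              # some type in E' mentioning X
once = apply_rows(wand, remaining)        # simpler substitution applied once
twice = apply_rows(wand, once)            # the effect of V
```

In this example, once still mentions X' alongside the fresh variable Y, so the number of distinct variables has not gone down, whereas twice mentions only Y.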